
Amazon Redshift

by SupportPRO Admin

Amazon Redshift is a fully managed, petabyte-scale data warehouse service in the cloud. A data warehouse architecture consists of three tiers. The bottom tier of the architecture is the database server, where data is loaded and stored. The middle tier consists of the analytics engine that is used to access and analyze the data. The top tier is the front-end client that presents results through reporting, analysis, and data mining tools.

  • Many vendors currently offer data warehouse as a service (DWaaS), including Actian, Amazon Web Services (AWS), Hewlett Packard Enterprise (HPE), IBM, Microsoft, and Oracle.
  • Amazon Redshift is easy to use compared to other enterprise data warehouse (EDW) providers. It automates common administrative tasks so that you can manage, monitor, and scale your data warehouse with push-button simplicity. This removes much of the undifferentiated heavy lifting of running a data warehouse and frees you to focus on analytics and core business needs.
  • Compared to traditional, legacy data warehouses, Amazon Redshift combines entry-level affordability with cost-efficiency at scale. AWS has advertised an unlimited number of users doing unlimited analytics on all your data for about $1,000 per terabyte per year; check current pricing for your node type and region.
  • Amazon Redshift manages provisioning, configuration, and patching. Data durability and availability are provided through automatic replication and backup to Amazon S3. To scale, you add or remove nodes with a single API call or through the AWS Management Console.
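As a rough illustration of the "single API call" mentioned above, the sketch below builds the arguments for the `resize_cluster` operation of the AWS SDK for Python (boto3). It assumes boto3 is installed and that credentials and a region are already configured; the helper names `build_resize_request` and `resize` are hypothetical and only for illustration.

```python
def build_resize_request(cluster_id: str, number_of_nodes: int,
                         node_type: str = "dc2.large") -> dict:
    """Build keyword arguments for boto3's redshift resize_cluster call."""
    if number_of_nodes < 1:
        raise ValueError("a cluster needs at least one node")
    cluster_type = "single-node" if number_of_nodes == 1 else "multi-node"
    params = {
        "ClusterIdentifier": cluster_id,
        "ClusterType": cluster_type,
        "NodeType": node_type,
    }
    # NumberOfNodes only applies to multi-node clusters.
    if cluster_type == "multi-node":
        params["NumberOfNodes"] = number_of_nodes
    return params


def resize(params: dict) -> dict:
    """Send the request. Assumes boto3 plus configured credentials/region."""
    import boto3  # imported here so the builder works without boto3
    return boto3.client("redshift").resize_cluster(**params)


# Example: grow the tutorial cluster to four nodes (request only, not sent).
request = build_resize_request("examplecluster", 4)
```

Calling `resize(request)` would submit the change; the same adjustment can be made from the console.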

How to start using Amazon Redshift?

The major steps for getting started with Amazon Redshift are given below.

  1. Set up an AWS account at https://aws.amazon.com.
  2. Install SQL client drivers and tools. (You must install any third-party database tools that you want to use with your clusters; Amazon Redshift does not provide or install any third-party tools or libraries.)
  3. Determine firewall rules. (Amazon Redshift uses port 5439 by default.)
  4. Create an IAM role for Amazon Redshift.
  5. Launch the Redshift cluster:
     • On the Amazon Redshift Dashboard, choose Launch Cluster.
     • On the Cluster Details page, enter the following values and then choose Continue:
       • Cluster Identifier: type examplecluster.
       • Database Name: leave this box blank. Amazon Redshift will create a default database named dev.
       • Database Port: type the port number on which the database will accept connections. You should have determined this port in step 3. You cannot change the port after launching the cluster, so make sure the port is open in your firewall so that SQL client tools can connect to the database in the cluster.
       • Master User Name: type masteruser. You will use this username and password to connect to your database after the cluster is available.
       • Master User Password and Confirm Password: type a password for the master user account.
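Once the cluster is available, SQL clients connect using the cluster endpoint together with the port and database name chosen above. The following is a minimal sketch of assembling a JDBC connection URL in the `jdbc:redshift://endpoint:port/database` form used by the Amazon Redshift JDBC driver. The endpoint hostname is a made-up placeholder, and `redshift_jdbc_url` is a hypothetical helper.

```python
def redshift_jdbc_url(endpoint: str, port: int = 5439, database: str = "dev") -> str:
    """Assemble a Redshift JDBC URL from the values entered in the wizard."""
    if not 1 <= port <= 65535:
        raise ValueError(f"invalid TCP port: {port}")
    return f"jdbc:redshift://{endpoint}:{port}/{database}"


# Placeholder endpoint; copy the real one from your cluster's details page.
url = redshift_jdbc_url("examplecluster.abc123xyz789.us-west-2.redshift.amazonaws.com")
print(url)
# jdbc:redshift://examplecluster.abc123xyz789.us-west-2.redshift.amazonaws.com:5439/dev
```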

On the Node Configuration page, select the following values and then choose Continue:

  • Node Type: dc2.large
  • Cluster Type: Single Node

On the Additional Configuration page, you will see different options depending on your AWS account, which determines the type of platform the cluster uses. To keep things simple for this tutorial, you do not need to understand the distinction between these platforms, EC2-Classic and EC2-VPC.

Associate an IAM role with the cluster.

For AvailableRoles, choose myRedshiftRole and then choose Continue.
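For a role such as myRedshiftRole to be usable by the cluster, its trust policy has to allow the Redshift service to assume it. The sketch below shows one way such a role might be created with boto3 instead of the console. It assumes boto3 is installed and credentials are configured, and it attaches no permission policies, which you would still need to add for the data the cluster should read. The function names are hypothetical.

```python
import json


def redshift_trust_policy() -> dict:
    """Trust policy that lets the Amazon Redshift service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "redshift.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def create_redshift_role(role_name: str = "myRedshiftRole") -> dict:
    """Create the role. Assumes boto3 and configured AWS credentials."""
    import boto3  # imported lazily; the policy builder has no dependencies
    iam = boto3.client("iam")
    return iam.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(redshift_trust_policy()),
    )


policy_json = json.dumps(redshift_trust_policy(), indent=2)
```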

On the Review page, review the selections that you’ve made and then choose Launch Cluster.
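The console choices above map fairly directly onto the `create_cluster` operation in boto3, so the same launch could be scripted. This is only a sketch: it assumes boto3 is installed with credentials and a region configured, the role ARN and password shown are placeholders, and `build_create_cluster_request` and `launch` are hypothetical helper names.

```python
def build_create_cluster_request(master_password: str, iam_role_arn: str,
                                 port: int = 5439) -> dict:
    """Mirror the wizard values as keyword arguments for create_cluster."""
    return {
        "ClusterIdentifier": "examplecluster",
        # DBName is omitted on purpose: Redshift then creates the default "dev".
        "Port": port,
        "MasterUsername": "masteruser",
        "MasterUserPassword": master_password,
        "NodeType": "dc2.large",
        "ClusterType": "single-node",
        "IamRoles": [iam_role_arn],
    }


def launch(params: dict) -> dict:
    """Send the request. Assumes boto3 plus configured credentials/region."""
    import boto3
    return boto3.client("redshift").create_cluster(**params)


# Placeholder values for illustration only; never hard-code real passwords.
request = build_create_cluster_request(
    master_password="ChangeMe-12345",
    iam_role_arn="arn:aws:iam::123456789012:role/myRedshiftRole",
)
```

Calling `launch(request)` would start the cluster and begin incurring charges, so it is left out of the example.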

 

Managing the Clusters

There are several ways to manage clusters. If you prefer a more interactive way of managing clusters, you can use the Amazon Redshift console or the AWS Command Line Interface (AWS CLI). If you are an application developer, you can use the Amazon Redshift Query API or the AWS Software Development Kit (SDK) libraries to manage clusters programmatically. If you use the Amazon Redshift Query API, you must authenticate every HTTP or HTTPS request to the API by signing it.
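To give a feel for what signing a request involves, the sketch below derives a signing key the way AWS Signature Version 4 does, by chaining HMAC-SHA256 over the date, region, and service name. It assumes Signature Version 4 is the scheme in use, covers only the key derivation (not the canonical request or the final signature), and uses a made-up secret key. The AWS CLI and SDKs perform all of this for you.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, message: str) -> bytes:
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key: str, date_stamp: str, region: str,
                       service: str = "redshift") -> bytes:
    """Derive a Signature Version 4 signing key.

    date_stamp is the request date as YYYYMMDD (UTC).
    """
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


# Made-up secret key, for illustration only.
signing_key = derive_signing_key("EXAMPLE-SECRET-KEY", "20220101", "us-east-1")
```

Because the date and region are baked into the key, a key derived for one day or region cannot sign requests for another.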

Amazon Redshift manages all the work of setting up, operating and scaling a data warehouse: provisioning capacity, monitoring and backing up the cluster, and applying patches and upgrades to the Amazon Redshift engine.

You can determine the Amazon Redshift engine and database versions for your cluster in the Cluster Version field in the console. The first two sections of the number are the cluster version, and the last section is the specific revision number of the database in the cluster.
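The version layout described above can be split mechanically. The following is a small sketch, assuming the value is three dot-separated sections; the sample value "1.0.12345" is invented for illustration, and `split_cluster_version` is a hypothetical helper.

```python
from typing import Tuple


def split_cluster_version(full_version: str) -> Tuple[str, str]:
    """Split a version string into (cluster version, revision number)."""
    parts = full_version.split(".")
    if len(parts) != 3:
        raise ValueError(f"expected three dot-separated sections: {full_version!r}")
    # First two sections: cluster version. Last section: database revision.
    return ".".join(parts[:2]), parts[2]


cluster_version, revision = split_cluster_version("1.0.12345")
print(cluster_version, revision)  # 1.0 12345
```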

An Amazon Redshift cluster consists of nodes. Each cluster has a leader node and one or more compute nodes. The leader node receives queries from client applications, parses them, and develops query execution plans. It then coordinates the parallel execution of these plans with the compute nodes, aggregates the intermediate results from those nodes, and returns the final results to the client applications.

Compute nodes execute the query execution plans and transmit data among themselves to serve these queries. The intermediate results are sent to the leader node for aggregation before being sent back to the client applications.
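The division of labor between leader and compute nodes can be pictured with a toy scatter-gather example. This is a conceptual sketch only, not how Redshift is implemented: threads stand in for compute nodes, each list stands in for the slice of data a node holds, and the function names are invented.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple


def compute_node_partial(rows: List[float]) -> Tuple[float, int]:
    """A 'compute node' returns an intermediate result for its own rows."""
    return sum(rows), len(rows)


def leader_average(slices: List[List[float]]) -> float:
    """The 'leader' fans work out, then aggregates the partial results."""
    if not slices:
        raise ValueError("no compute nodes to query")
    with ThreadPoolExecutor(max_workers=len(slices)) as pool:
        partials = list(pool.map(compute_node_partial, slices))
    total = sum(partial_sum for partial_sum, _ in partials)
    count = sum(row_count for _, row_count in partials)
    if count == 0:
        raise ValueError("no rows to average")
    return total / count


# Three "compute nodes", each holding part of one column.
print(leader_average([[1, 2, 3], [4, 5], [6]]))  # 3.5
```

Each node returns only a small intermediate result (a sum and a count), so the leader can combine them without seeing every row.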

