Quick Start

Deploy the server

Create a table

Insert sample data

Run example queries

1. Deploy the server

Select the Scale plan for a 14-day free trial with a $300 usage credit. A valid credit card is required to verify the account.

To deploy a cloud database, navigate to https://platform.hydra.so/projects. For a detailed walkthrough, check out the Deploying Hydra Cloud guide.

2. Create a table

We will connect to Hydra using psql, the terminal-based front-end to Postgres, but you can use any Postgres client you’d prefer and achieve the same result. We will start by creating our analytics schema with a table “analytics.rides” which we will populate with sample data from the New York Taxi data in step 3.

CREATE SCHEMA analytics;
CREATE TABLE analytics.rides
(
	vendor_id text,
	pickup_datetime timestamp,
	dropoff_datetime timestamp,
	passenger_count int,
	trip_distance float,
	pickup_longitude float,
	pickup_latitude float,
	rate_code text,
	store_and_fwd_flag text,
	dropoff_longitude float,
	dropoff_latitude float,
	fare_amount float,
	surcharge float,
	mta_tax float,
	tip_amount float,
	tolls_amount float,
	total_amount float
);

3. Insert sample data

Let’s insert the New York Taxi data into our rides table in the analytics schema. Below we’ve written example queries you can copy / paste.

INSERT INTO analytics.rides
SELECT 
    r['vendor_id']::text,
    r['pickup_datetime']::timestamp,
    r['dropoff_datetime']::timestamp,
    r['passenger_count']::int,
    r['trip_distance']::float,
    r['pickup_longitude']::float,
    r['pickup_latitude']::float,
    r['rate_code']::text,
    r['store_and_fwd_flag']::text,
    r['dropoff_longitude']::float,
    r['dropoff_latitude']::float,
    r['fare_amount']::float,
    r['surcharge']::float,
    r['mta_tax']::float,
    r['tip_amount']::float,
    r['tolls_amount']::float,
    r['total_amount']::float
FROM read_parquet('https://idl.uw.edu/mosaic-datasets/data/nyc-rides-2010.parquet') AS r;

4. Run example queries

Q0: SELECT COUNT(*)

SELECT count(*) FROM analytics.rides;

SELECT COUNT(*) … counts the number of rows in the analytics.rides table.

Q1: SELECT * FROM

SELECT * FROM analytics.rides LIMIT 10;

SELECT * FROM … retrieves up to 10 rows of data from the analytics.rides table.

Q2: SELECT vendor_id

SELECT
   vendor_id,
   count(*) AS count,
   sum(fare_amount) AS revenue
FROM analytics.rides
GROUP BY vendor_id
ORDER BY count DESC
LIMIT 10;

SELECT vendor_id … is list of the top 10 vendors, displaying the number of rides and total revenue for each, ordered by the greatest number of rides first.

When using cloud hosted Hydra run each statement by clicking the green triangle next to each query in the SQL Editor.

5. Insert existing data

Now let’s test Hydra with your own data.

It’s a good idea to drop the “analytics.rides” sample data that was inserted earlier before loading your own files from S3, PostgreSQL, Google Cloud Storage, or local files.

To delete the sample data:

DROP TABLE analytics.rides;

Note: Switch the codebox tabs for steps on S3, Postgres, and/or local data.

-- Connect bucket, Session token is optional
INSERT INTO duckdb.secrets
 (type, key_id, secret, session_token, region)
 VALUES ('S3', 'access_key_id', 'secret_access_key', 'session_token', 'us-east-1');

-- example: create a table in the analytics schema from a file in S3.
CREATE TABLE analytics.your_table_name 
USING duckdb
AS
SELECT *
FROM read_parquet('s3://your_bucket_name/your_file_name.parquet') AS
(
   -- specify columns and types here, e.g.:
   -- id int,
   -- name text, ...
);

What’s Next?

We recommend

following the migration documentation from Object Storage (S3, GCS), Amazon RDS, Heroku, Render, Postgres.
learning about Hydra, our team and partners behind the project
scanning the Hydra architecture

Introduction

Guides

Operating Hydra

Resources

1. Deploy the server

2. Create a table

3. Insert sample data

4. Run example queries

5. Insert existing data

What’s Next?

Introduction

Guides

Operating Hydra

Resources

​1. Deploy the server

​2. Create a table

​3. Insert sample data

​4. Run example queries

​5. Insert existing data

​What’s Next?

1. Deploy the server

2. Create a table

3. Insert sample data

4. Run example queries

5. Insert existing data

What’s Next?