Follow this quickstart guide for best practices on setting up Hydra and add real-time analytics to any project in less than a day.

Sign-up

  • To get started, please sign up at https://www.hydra.so/
  • Check your email. Once approved, select the purple “Sign Up” link in the email.

Hydra Postgres

  • Every Production Hydra Account includes 100 cloud compute hours, 10GB on-disk storage, and 100GB of Hydra’s globally-distributed data lake storage. These specs and limits can be modified at any time.
  • Hydra selects the latest supported version of Postgres by default. This setting can be modified during setup or at any point afterward.
  • Hydra selects US-East by default. If you’d like a different region for your Postgres database, please contact our support team.
  • Hydra selects 10GB on-disk storage by default, which you can edit at any point. Note: on-disk storage cannot be scaled down to a smaller footprint once provisioned for an instance.

Once setup completes, your Postgres instance will be listed as “Running”, showing both the database specifications and the Postgres connection string.
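For example, you can connect from a terminal with psql. The connection string below is a sketch using placeholder credentials in the format shown on the dashboard:

psql "postgres://username:password@hydra-hostname.fly.hydradb.io:5432/databasename"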

Enable DuckDB processing

DuckDB execution is enabled automatically when needed: for example, whenever you use DuckDB functions (such as read_csv), when you query DuckDB tables, and when running COPY table TO 's3://...'. However, if you want queries that only touch Postgres tables to use DuckDB execution, that is opt-in: run SET duckdb.force_execution TO true. To avoid doing that in every session, you can configure it for a specific user with ALTER USER my_analytics_user SET duckdb.force_execution TO true.
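For example, in the SQL Editor or a psql session (my_analytics_user is a placeholder role name):

-- Opt this session in: route queries over plain Postgres tables through DuckDB.
SET duckdb.force_execution TO true;

-- Or persist the setting for a specific user so every new session picks it up.
ALTER USER my_analytics_user SET duckdb.force_execution TO true;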

Load Data into Hydra Postgres

The following setup instructions are for populating data into Hydra Postgres from an existing Postgres database. View the detailed guides for migrating data from Heroku, Render, AWS RDS, and self-managed Postgres. If you do not have an existing database, jump ahead to the next step or load the sample data in the “playing_with_hydra” example below.

Capture a backup file of your existing database. You will need to know the hostname, username, password, and database name of the database you wish to capture.

pg_dump -Fc --no-acl --no-owner \
    -h your.db.hostname \
    -U username \
    databasename > mydb.dump

Restoring data into Hydra

Using the captured data backup, you can copy your data into Hydra using pg_restore as follows. You will need the hostname, username, password, and database name of your Hydra database. You can find these on the Hydra dashboard.

pg_restore --verbose --clean --no-acl --no-owner \
    -h hydra-hostname.fly.hydradb.io \
    -U username \
    -d databasename \
    mydb.dump
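To confirm the restore succeeded, you can list the restored tables. The values below are the same placeholders used in the pg_restore command:

psql -h hydra-hostname.fly.hydradb.io -U username -d databasename -c '\dt'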

Using Sample Data

You can use the following queries in the SQL Editor tab. The queries create a table, add data, and retrieve the data from the table.

CREATE TABLE IF NOT EXISTS playing_with_hydra (
	id SERIAL PRIMARY KEY,
	name TEXT NOT NULL, value REAL
);

INSERT INTO playing_with_hydra(name, value)
SELECT LEFT(md5(i::TEXT), 10), random()
FROM generate_series(1, 10) s(i);

SELECT * FROM playing_with_hydra;

Run each statement by clicking the green triangle next to each query.

Connect Data Lake Storage

To enable a data lake, you can use Hydra’s globally-distributed data lake, AWS S3, GCP Cloud Storage, or Cloudflare R2. If you’d like Hydra to support any additional storage vendors, please contact our support team.

  • Connect to Postgres using your Hydra Postgres connection string.
  • Add a secret to the duckdb.secrets table:
INSERT INTO duckdb.secrets
(cloud_type, cloud_id, cloud_secret, cloud_region)
VALUES ('S3', 'access_key_id', 'secret_access_key', 'us-east-1');
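With the secret in place, Postgres can read from and write to your bucket. A minimal sketch, assuming a bucket named your-bucket and the sample table from the previous step (exact read_parquet column handling can vary by pg_duckdb version):

-- Write a Postgres table to the data lake as Parquet.
COPY playing_with_hydra TO 's3://your-bucket/playing_with_hydra.parquet';

-- Read it back directly from object storage with a DuckDB function.
SELECT count(*) FROM read_parquet('s3://your-bucket/playing_with_hydra.parquet');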

If you would like to try Hydra’s globally distributed storage beta, please contact our support team for setup.

Using pg_duckdb: Data Lake Example

With your Hydra Postgres database now set up, data loaded, and data lake storage connected, view our complete guide on the Data Lake Example documentation page.

Next Steps

Next, we recommend following the views documentation for best practices. Setting up views and establishing caching with Data Lake storage is a great way to pre-compute results and avoid network latency between the compute and storage layers.
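As a sketch of that pattern (the bucket path, view name, and use of a materialized view here are illustrative assumptions; see the views documentation for Hydra’s recommended setup):

-- Pre-compute an aggregate from a data lake file into local storage.
CREATE MATERIALIZED VIEW lake_row_count AS
SELECT count(*) AS row_count
FROM read_parquet('s3://your-bucket/playing_with_hydra.parquet');

-- Subsequent reads hit the local result instead of object storage.
SELECT * FROM lake_row_count;

-- Refresh after new data lands in the bucket.
REFRESH MATERIALIZED VIEW lake_row_count;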