Questions tagged [vertica]

Vertica is an MPP column-store analytics database that runs on clusters of Linux servers or in the cloud. The original configuration (Enterprise mode) distributes data and query execution among nodes (typically with duplication). A newer configuration (Eon mode) separates compute from storage and relies on S3 for storage.

Vertica is a columnar database designed for data warehousing. Vertica is based on C-Store, which was an academic project involving academic and entrepreneur Michael Stonebraker (a co-founder of Vertica, Ingres, VoltDB) and colleagues from Brandeis University, Brown University, MIT, and University of Massachusetts Boston.

As part of the C-store architecture table data is stored on disk in columns rather than in rows. Columns are stored independently of other columns and allow for increased compression on storage and decreased I/O during query operations.

Vertica has advanced analytical and machine learning functions built-in.

Vertica is designed to run on clusters of Linux nodes. A Vertica cluster runs in a MPP (Massive Parallel Processing) fashion, and it is designed to scale linearly by adding nodes to the cluster.

Vertica clusters can be deployed on premise (in Enterprise Mode) or in the cloud (in either Enterprise Mode or Eon Mode). Enterprise Mode consists of Vertica nodes that share compute and storage resources. Eon Mode is a cloud based mode in which there is a separation of compute nodes (AWS EC2 instances) and storage (S3). This separation of compute and storage allows for rapid scalability (adding or removing compute nodes) without the need to rebalance data across the nodes in the cluster.

Veritca was purchased by HP in 2011, and then acquired by Micro Focus as part of the HPE merger in 2017. As such, Vertica is sometimes referred to in old documentation and community posts as HP Vertica or HPE Vertica.

Vertica comes with a command line utility called vsql for connecting to the database.

1420 questions
-1
votes
2 answers

Get sum of digits in string of key/value structure

I have strings in vertica database like: 'a18: 2, b34: 5, n29: 10' and i need to extracts digits from it and get sum. So the output should be: 17 (calculated 2+5+10) I need it for working in Tableau. How can i do it?
Chick Chirik
  • 115
  • 1
  • 10
-1
votes
3 answers

Select n distinct values from one column and all other data

I have a table that logs one row of data per user per day with a user_id column. I would like to select all days from n random users, but the table is incredibly large, so I'd like to avoid doing a self join. Right now, I'm doing: SELECT …
jtanman
  • 654
  • 1
  • 4
  • 18
-1
votes
1 answer

How to rollback a Vertica copy from stdin when an upstream error occurs?

On Linux, I have files on a remote system that are too big to hold on my local system so I am using ssh to stream the files to the local system and using Vertica's copy stdin to enter the data. The problem is, occasionally the streaming fails for…
user1683793
  • 1,213
  • 13
  • 18
-1
votes
2 answers

Cross join and aggregation in Vertica DB

I have two tables in Vertica Database 1) cross join of all existing locations and categories location_1 category_1 location_1 category_2 ... location_2 category_1 location_2 category_2 ... location_n category_n-1 location_n category_n 2)…
pesya
  • 43
  • 6
-1
votes
1 answer

Where can we get Resources for vertica DB?

Can i find any resources like PDF or User Guide for learning vertica DB? As i am beginner in vertica, also I am looking for the performance factor which affects while loading the data as well.
Avinash
  • 393
  • 1
  • 4
  • 9
-1
votes
1 answer

Vertica Backup/Restore

I am Using Vertica 9.1.x and trying to setup Backup Policy. Vertica provides Utility vbr to Backup with Limitation of Same Number of Nodes and Same Node name. Are there any other methods which can help in Backing up in such a way so that restore has…
Shrijan Tiwari
  • 673
  • 6
  • 17
-1
votes
1 answer

Formulate millis into HHH:MM:SS

Does anyone know of a way of concatenating millis into an HHH:MM:SS format? Where the hours can go over 24:59:59, but not add another day (as in 1970-01-02 00:00:01). I'd like it to show as 25:00:01 for example. I can split the time up and…
ah1
  • 71
  • 3
  • 9
-1
votes
3 answers

Vertica SQL: "or" in join leads to huge slowdown?

I have a query like select count(distinct tab1.id) from tab1 join tab2 on tab1.email = tab2.email_a but when I change it to select count(distinct tab1.id) from tab1 join tab2 on tab1.email = tab2.email_a or tab1.email = tab2.email_b then suddenly…
Hanmyo
  • 543
  • 4
  • 8
  • 18
-1
votes
2 answers

Creating temp table in Vertica

want to create a temp table in vertica containing lat, lng and restaurant_name I am writing the code as below but it throws an error: CREATE LOCAL TEMP TABLE geo_raw ( lat float, lng float, resto_name varchar) on commit PRESERVE rows INSERT INTO…
Nimish Shah
  • 1
  • 1
  • 1
-1
votes
1 answer

GroupBy query for billion records - Vertica

I am working on an application where records are in billions and I need to make a query where GroupBy clause is needed. Table Schema: CREATE TABLE event ( eventId INTEGER PRIMARY KEY, eventTime INTEGER NOT NULL, sourceId INTEGER NOT NULL, …
Bilal Ahmed
  • 87
  • 1
  • 9
-1
votes
1 answer

SQL Vertica, Cast Long issue

I have the following SQL statement cast('long', ((substr(D.business_day, 1, 4)||substr(D.business_day, 6, 2))||substr(D.business_day, 9, 2))) AS bdate_id_yyyymmdd, I have tried to change the 'long' to integer to see if that would…
excelguy
  • 1,574
  • 6
  • 33
  • 67
-1
votes
2 answers

Sampling without replacement with a different sample size per group in SQL

Using the provided table I would like to randomly sample users per day. The number of users to be sampled is specified in the to_sample column and it is filled by another query. In this example I would like to sample 1 observation for day one and 2…
abu
  • 737
  • 5
  • 8
  • 19
-1
votes
1 answer

SQL - Searching for Duplications within a range of days

I love this site, and it's helped me tons but this is my first posted question. I searched and cannot find an answer to this specific question. Very new to SQL. Let's say I have a table that breaks down each line in an invoice where…
Kyngfish
  • 1
  • 2
-1
votes
2 answers

Fastest way to update database table

I have a table with around 100 million rows of data. I need to update some columns of this table (around 50) based on some values from another table. It has only one primary key column. No other constraints. something like this UPDATE …
-1
votes
1 answer

Select single column from a sub query returning multiple columns

This is my query select * from table where a in (select a,b,count(*) from table where b= 99991231 group by 1,2 having count(*) > 1) ; ERROR 4829: Subquery has too many columns I want to select those values of a where b's value is 99991231 and…
Swathi8038
  • 96
  • 2
  • 11