Questions tagged [vertica]

Vertica is an MPP column-store analytics database that runs on clusters of Linux servers or in the cloud. The original configuration (Enterprise mode) distributes data and query execution among nodes (typically with duplication). A newer configuration (Eon mode) separates compute from storage and relies on S3 for storage.

Vertica is a columnar database designed for data warehousing. Vertica is based on C-Store, which was an academic project involving academic and entrepreneur Michael Stonebraker (a co-founder of Vertica, Ingres, VoltDB) and colleagues from Brandeis University, Brown University, MIT, and University of Massachusetts Boston.

As part of the C-store architecture table data is stored on disk in columns rather than in rows. Columns are stored independently of other columns and allow for increased compression on storage and decreased I/O during query operations.

Vertica has advanced analytical and machine learning functions built-in.

Vertica is designed to run on clusters of Linux nodes. A Vertica cluster runs in a MPP (Massive Parallel Processing) fashion, and it is designed to scale linearly by adding nodes to the cluster.

Vertica clusters can be deployed on premise (in Enterprise Mode) or in the cloud (in either Enterprise Mode or Eon Mode). Enterprise Mode consists of Vertica nodes that share compute and storage resources. Eon Mode is a cloud based mode in which there is a separation of compute nodes (AWS EC2 instances) and storage (S3). This separation of compute and storage allows for rapid scalability (adding or removing compute nodes) without the need to rebalance data across the nodes in the cluster.

Veritca was purchased by HP in 2011, and then acquired by Micro Focus as part of the HPE merger in 2017. As such, Vertica is sometimes referred to in old documentation and community posts as HP Vertica or HPE Vertica.

Vertica comes with a command line utility called vsql for connecting to the database.

1420 questions
6
votes
4 answers

Create a variable in Vertica

I am transitioning from SQL Server to Vertica. Is there any comparable way to create a variable? Usually I do something like: Define @myVariable int Set @myVariable = select MAX(Field1) from myTable
ScottieB
  • 3,958
  • 6
  • 42
  • 60
6
votes
2 answers

JOIN in Vertica fails with "inner partition did not fit in memory"

I have a problem with a big query from ten joined tables. I'm migrating data from a wide fact table (f1) into a star schema. I begin by populating the dimension tables from f1 and then I populate the new fact table (f2) with a join to the dimension…
user997904
  • 343
  • 3
  • 5
6
votes
10 answers

SQL issue - calculate max days sequence

There is a table with visits data: uid (INT) | created_at (DATETIME) I want to find how many days in a row a user has visited our app. So for instance: SELECT DISTINCT DATE(created_at) AS d FROM visits WHERE uid = 123 will return: d …
deadrunk
  • 13,861
  • 4
  • 29
  • 29
5
votes
5 answers

How to provide Vertica user with read-only access to certain specified system tables?

We're looking to set up a user in our Vertica database that can see certain system tables, (projections, projection_storage and views), but we don't want this user to be a dbadmin, because we don't want them to have write privileges on these tables.…
BVH
  • 184
  • 2
  • 5
5
votes
9 answers

Why do Column oriented databases such as Vertica/InfoBright/GreenPlum make a fuss of Hadoop?

What is the point in feeding an Hadoop cluster and using that cluster to feed data into a Vertica/InfoBright datawarehouse ? All thse vendor keep saying "we can connect with Hadoop", but I don't understand what's the point. What is the interest of…
SCO
  • 1,832
  • 1
  • 24
  • 45
5
votes
3 answers

How do I display the query time when a query completes in Vertica?

When using vsql, I would like to see how long a query took to run once it completes. For example when i run: select count(distinct key) from schema.table; I would like to see an output like: 5678 (1 row) total query time: 55 seconds. If this is not…
Ilya Kaplun
  • 51
  • 1
  • 5
5
votes
3 answers

dBeaver doesn't show me all the results from a Vertica table

I am trying to get a list of users, who submitted a specific event in a website, however when I run the query, I am not getting the full result set - for example, I found 2 users who had the event (and I used the same conditions), but are not in the…
Teodora Georgieva
  • 145
  • 1
  • 1
  • 7
5
votes
1 answer

Query Parquet data through Vertica (Vertica Hadoop Integration)

So I have a Hadoop cluster with three nodes. Vertica is co-located on cluster. There are Parquet files (partitioned by Hive) on HDFS. My goal is to query those files using Vertica. Right now what I did is using HDFS Connector, basically create an…
Jesse
  • 174
  • 12
5
votes
2 answers

How can see the SQL statements that SPARK sends to my database?

I have a spark cluster and a vertica database. I use spark.read.jdbc( # etc to load Spark dataframes into the cluster. When I do a certain groupby function df2 = df.groupby('factor').agg(F.stddev('sum(PnL)')) df2.show() I then get a vertica…
ThatDataGuy
  • 1,969
  • 2
  • 17
  • 43
5
votes
2 answers

Connect to Vertica from Datagrip

I'm using DataGrip and I'm kind of new in it. There is a case for me to connect to Vertica DB. As far as I know, there is no native provided driver for connection to that type of databases. What steps should I take to connect to it? Is there some…
Keithx
  • 2,994
  • 15
  • 42
  • 71
5
votes
2 answers

Compute ratio of group sizes using SQL

Consider a simple group by query: select foo, count(*) from mytable where bar=10 group by foo This returns a table that has the following form: foo | count ----+------ a | 100 b | 200 c | 300 My goal is to get, using a single query the…
Dror
  • 12,174
  • 21
  • 90
  • 160
5
votes
1 answer

Display vertica records vertically in vsql

I am look for a command in vsql which is equivalent of \g in mysql console, because I need display the result set vertically, looking like this:
Jade Tang
  • 321
  • 4
  • 23
5
votes
2 answers

SQL Runnnig the same Query multiple times

I have a query that gives me the COUNT between two dates, start_date & end_date and is grouped by various columns. Is there any way I can get the COUNT for each day ? As in say start_date is date1 and end_date is date5, so I need to run the query…
Neha
  • 233
  • 1
  • 3
  • 11
5
votes
1 answer

how to get the first (or any single) value in GROUP BY without ARRAY_AGG?

I'm migrating some SQL from PostgreSQL 9.2 to Vertica 7.0, and I could use some help replacing postgres's cool array_agg feature with something that Vertica (and possibly other RDBMS) supports, such as partitions and window functions. I'm new to…
Matthew Cornell
  • 4,114
  • 3
  • 27
  • 40
5
votes
4 answers

Concat GROUP BY in Vertica SQL

I need to get a comma separated list of ids as a field for a messy third party api :s This is a simplified version of what I am trying to achieve. | id | name | |====|======| | 01 | greg | | 02 | paul | | 03 | greg | | 04 | greg | | 05 | paul…
Gregology
  • 1,625
  • 2
  • 18
  • 32
1 2
3
94 95