I created a table of 5000 rows of random integers, each less than 1000, and I am running an APPROX_SUM query against it.

The query always fails with this exception:

Exactly one argument is expected.

However, I am passing only one column, and it contains only integers, as the output below shows. I was running shark-withinfo. Can someone give me a hint on how to solve the problem?

shark> DESCRIBE rand5000;                         
18/10/11 11:57:09 INFO shark.SharkCliDriver: Execution Mode: shark  
18/10/11 11:57:10 INFO ql.Driver: <PERFLOG method=Driver.run>  
18/10/11 11:57:10 INFO ql.Driver: <PERFLOG method=compile>  
18/10/11 11:57:10 INFO parse.ParseDriver: Parsing command: DESCRIBE rand5000  
18/10/11 11:57:10 INFO parse.ParseDriver: Parse Completed  
18/10/11 11:57:10 INFO parse.DDLSemanticAnalyzer: analyzeDescribeTable done  
18/10/11 11:57:10 INFO ql.Driver: Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:col_name, type:string, comment:from deserializer), FieldSchema(name:data_type, type:string, comment:from deserializer), FieldSchema(name:comment, type:string, comment:from deserializer)], properties:null)  
18/10/11 11:57:10 INFO ql.Driver: </PERFLOG method=compile start=1539277030001 end=1539277030529 duration=528>  
18/10/11 11:57:10 INFO ql.Driver: <PERFLOG method=Driver.execute>  
18/10/11 11:57:10 INFO ql.Driver: Starting command: DESCRIBE rand5000  
18/10/11 11:57:10 INFO metastore.HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
18/10/11 11:57:10 INFO metastore.ObjectStore: ObjectStore, initialize called  
7.700: [GC (Metadata GC Threshold)  355352K->30896K(5024768K), 0.0217271 secs]  
7.722: [Full GC (Metadata GC Threshold)  30896K->20373K(5024768K), 0.0616700 secs]  
8.237: [GC (System.gc())  114782K->22795K(5024768K), 0.0052734 secs]  
8.243: [Full GC (System.gc())  22795K->11577K(5024768K), 0.1667079 secs]  
8.411: [GC (System.gc())  48292K->11712K(5024768K), 0.0013864 secs]  
8.412: [Full GC (System.gc())  11712K->10076K(5024768K), 0.0616413 secs]  
18/10/11 11:57:13 INFO metastore.ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"    
18/10/11 11:57:13 INFO metastore.ObjectStore: Initialized ObjectStore  
18/10/11 11:57:14 INFO metastore.HiveMetaStore: 0: get_table : db=default tbl=rand5000  
18/10/11 11:57:14 INFO hive.log: DDL: struct rand5000 { i32 numbers}  
18/10/11 11:57:14 INFO exec.DDLTask: DDLTask: got data for rand5000  
18/10/11 11:57:14 INFO exec.DDLTask: DDLTask: written data for rand5000  
18/10/11 11:57:14 INFO ql.Driver: </PERFLOG method=Driver.execute start=1539277030530 end=1539277034972 duration=4442>  
OK  
18/10/11 11:57:14 INFO ql.Driver: OK  
18/10/11 11:57:14 INFO ql.Driver: <PERFLOG method=releaseLocks>  
18/10/11 11:57:14 INFO ql.Driver: </PERFLOG method=releaseLocks start=1539277034972 end=1539277034972 duration=0>  
18/10/11 11:57:14 INFO ql.Driver: </PERFLOG method=Driver.run start=1539277030001 end=1539277034973 duration=4972>  
18/10/11 11:57:15 INFO mapred.FileInputFormat: Total input paths to process : 1  
numbers int   
Time taken: 5.078 seconds  
18/10/11 11:57:15 INFO CliDriver: Time taken: 5.078 seconds  
18/10/11 11:57:15 INFO ql.Driver: <PERFLOG method=releaseLocks>  
18/10/11 11:57:15 INFO ql.Driver: </PERFLOG method=releaseLocks start=1539277035078 end=1539277035078 duration=0>  
shark> SELECT APPROX_SUM(numbers) FROM rand5000;  
18/10/11 11:57:28 INFO shark.SharkCliDriver: Execution Mode: shark  
18/10/11 11:57:28 INFO ql.Driver: <PERFLOG method=Driver.run>  
18/10/11 11:57:28 INFO ql.Driver: <PERFLOG method=compile>  
18/10/11 11:57:28 INFO parse.ParseDriver: Parsing command: SELECT APPROX_SUM(numbers) FROM rand5000  
18/10/11 11:57:28 INFO parse.ParseDriver: Parse Completed  
18/10/11 11:57:28 INFO parse.SharkSemanticAnalyzer: Get metadata for source tables  
18/10/11 11:57:28 INFO metastore.HiveMetaStore: 0: get_table : db=default tbl=rand5000  
18/10/11 11:57:28 INFO hive.log: DDL: struct rand5000 { i32 numbers}  
18/10/11 11:57:28 INFO parse.SharkSemanticAnalyzer: Get metadata for subqueries  
18/10/11 11:57:28 INFO parse.SharkSemanticAnalyzer: Get metadata for destination tables  
18/10/11 11:57:28 INFO hive.log: DDL: struct rand5000 { i32 numbers}
FAILED: Error in semantic analysis: Exactly one argument is expected.    
18/10/11 11:57:28 ERROR shark.SharkDriver: FAILED: Error in semantic analysis: Exactly one argument is expected.  
org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException: Exactly one argument is expected.
        at org.apache.hadoop.hive.ql.udf.approx.ApproxUDAFSum.getEvaluator(ApproxUDAFSum.java:60)
        at org.apache.hadoop.hive.ql.udf.generic.AbstractGenericUDAFResolver.getEvaluator(AbstractGenericUDAFResolver.java:47)
        at org.apache.hadoop.hive.ql.exec.FunctionRegistry.getGenericUDAFEvaluator(FunctionRegistry.java:785)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getGenericUDAFEvaluator(SemanticAnalyzer.java:2464)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanMapGroupByOperator(SemanticAnalyzer.java:2904)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanMapAggr1MR(SemanticAnalyzer.java:3704)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:6183)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:6820)
        at shark.parse.SharkSemanticAnalyzer.analyzeInternal(SharkSemanticAnalyzer.scala:160)
        at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:244)
        at shark.SharkDriver.compile(SharkDriver.scala:194)
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:336)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:895)
        at shark.SharkCliDriver.processCmd(SharkCliDriver.scala:294)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:406)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:341)
        at shark.SharkCliDriver$.main(SharkCliDriver.scala:203)
        at shark.SharkCliDriver.main(SharkCliDriver.scala)

18/10/11 11:57:28 INFO ql.Driver: </PERFLOG method=compile start=1539277048550 end=1539277048725 duration=175>
18/10/11 11:57:28 INFO ql.Driver: <PERFLOG method=releaseLocks>
18/10/11 11:57:28 INFO ql.Driver: </PERFLOG method=releaseLocks start=1539277048725 end=1539277048725 duration=0>
GhostCat
bbh
  • Welcome to Stack Overflow! Other users marked your question for low quality and need for improvement. I re-worded/formatted your input to make it easier to read/understand. Please review my changes to ensure they reflect your intentions. But I think your question is still not answerable. **You** should [edit] your question now, to add missing details (see [mcve] ). Feel free to drop me a comment in case you have further questions or feedback for me. – GhostCat Oct 12 '18 at 03:07
  • I suggest you use a table with just a few rows, and then you add all your steps to your question. We can't tell you what you did wrong when you only **describe** what exactly you were doing. – GhostCat Oct 12 '18 at 03:08
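
One way to act on the suggestion above is to build a tiny, reproducible input file, so that every step can be listed in the question. The following is a minimal sketch, assuming the table is loaded from a plain text file with one integer per line; the file name `rand_small.txt` and the helper `write_numbers` are hypothetical and do not come from the original post.

```python
import random
from pathlib import Path


def write_numbers(path, count=10, upper=1000, seed=0):
    """Write `count` random integers in [0, upper) to `path`, one per line."""
    rng = random.Random(seed)  # fixed seed so the same file can be regenerated
    numbers = [rng.randrange(upper) for _ in range(count)]
    Path(path).write_text("\n".join(str(n) for n in numbers) + "\n")
    return numbers


if __name__ == "__main__":
    # Hypothetical file name; load it into a small table and rerun the query.
    numbers = write_numbers("rand_small.txt")
    # Exact sum, for comparison with whatever the query returns.
    print("rows:", len(numbers), "exact sum:", sum(numbers))
```

With a file this small, the exact sum printed by the script can be compared directly against the result of a plain SUM and of APPROX_SUM on the same column.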

0 Answers