9

I'm getting "Read timed out" when running a query on a 1,3b row db.

It is not a particular advanced query that groups together hashtags in tweets:

SELECT case when match(hashtag, 
            '[Cc]orona.*|COVID.*|[Cc]ovid.*|[Cc]oVID_19.*|[Cc]orvid19.*|COVD19.*|CORONA.*|KILLTHEVI.*|SARSCoV.*|ChineseVi.*|WuhanVir.*|ChinaVir.*|[Vv]irus.*|
            [Qq]uarantine|[Pp]andemic.*|[Cc]linical[Tt]rial.*|FlattenTheCurve.*|SocialDistancing.*|StayHome.*|StayTheFHome.*|StayAtHome.*|stopthespread.*|
            SafeHands.*|WashYourHands.*|SelfIsolation.*')                           then 'COVID19' 
            when match(hashtag, '[Jj]anta[Cc]urfew.*|[Jj]anata[Cc]urfew.*')         then 'JantaCurfew'
            when match(hashtag, 'Bhula.*')                                          then 'Bhula'
            when match(hashtag, '[Ss]t[Pp]atrick.*|HappyStPatrick')                 then 'StPatricks day'
            when match(hashtag, '[Cc]hina.*')                                       then 'China'
            when match(hashtag, '[Ii]taly.*')                                       then 'Italy'
            when match(hashtag, '[Ii]ran.*')                                        then 'Iran'
            when match(hashtag, '[Ii]ndia.*')                                       then 'India'
            when match(hashtag, '[Hh]appy[Mm]others[Dd]ay.*|[Mm]others[Dd]ay.*')    then 'MothersDay'
            else hashtag END
            as Hashtag,
  SUM(CASE WHEN created >= '2020-05-14 00:00:00' AND created <= '2020-03-14 23:59:59' THEN 1 END) "May 14th'20",
  SUM(CASE WHEN created >= '2020-05-13 00:00:00' AND created <= '2020-03-13 23:59:59' THEN 1 END) "May 13th'20",
  SUM(CASE WHEN created >= '2020-05-12 00:00:00' AND created <= '2020-03-12 23:59:59' THEN 1 END) "May 12th'20"
FROM twitterDBhashtags
group by Hashtag 
order by "May 12th'20" DESC limit 20;

Clickhouse is running on a striped hdd and accessed through GB network.

How can the timeout, if that is the challenge, be changed to allow for more time?

I would very much want to be able to run multi minutes queries without getting the "Read timed out" message, if possible.

vladimir
  • 13,428
  • 2
  • 44
  • 70
questionmark
  • 147
  • 1
  • 2
  • 5

1 Answers1

26

CH jdbc driver has a socket_timeout = 30000 (30s) by default

Under the Advanced tab, you can configure advanced connections settings, > e.g., Character Coding.

Connection / Advanced properties / New property -> socket_timeout = 300000

Denny Crane
  • 11,574
  • 2
  • 19
  • 30
  • 5
    Perfect. I knew there was something I missed. Trees and the forest.. "Edit Connections" -> "Driver Properties" -> scroll down the list to "socket_timeout" :-) . Now it happily runs – questionmark Aug 28 '20 at 15:24
  • In the community edition Version 22.2.4.202211061524 Right click the connection -> Edit (F4) -> Driver properties tab -> socket_timeout – Ajay M Nov 11 '22 at 16:42
  • Nowadays the field is in camelCase -> socketTimeout – nashvent Feb 23 '23 at 14:42