
I have Twitter datasets in which very few tweets have geo-location tags (lat, long). I would like to find or estimate the locations of as many tweets as possible. How can I do this in Python? Any example would help!

So far, I have written the following code:

import geopandas as gpd

# Turn the long/lat columns into point geometries in WGS 84 (EPSG:4326)
gdf = gpd.GeoDataFrame(df2, crs=4326,
                       geometry=gpd.points_from_xy(df2["long"], df2["lat"]))
# Drop columns that are no longer needed
gdf = gdf.drop(["lat", "text", "lang"], axis=1)

This code shows the coordinates of the tweets that already have them, but it does not fill in the missing locations of the other tweets.
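
One common way to estimate a missing tweet location is to geocode the free-text place name from the author's profile. The sketch below is illustrative only and is not part of the question's code: it assumes a hypothetical "user_location" string column in df2 and uses geopy's Nominatim geocoder.

# A minimal sketch, assuming a hypothetical "user_location" column holding the
# free-text location from the user's profile. Only rows whose tweet has no
# native coordinates are geocoded.
import pandas as pd
from geopy.geocoders import Nominatim
from geopy.extra.rate_limiter import RateLimiter

geolocator = Nominatim(user_agent="tweet-location-inference-example")
# Nominatim asks for at most one request per second
geocode = RateLimiter(geolocator.geocode, min_delay_seconds=1)

def lookup(place):
    """Return (lat, long) for a profile location string, or (None, None)."""
    if pd.isna(place) or not str(place).strip():
        return pd.Series([None, None])
    loc = geocode(place)
    if loc is None:
        return pd.Series([None, None])
    return pd.Series([loc.latitude, loc.longitude])

missing = df2["lat"].isna()  # tweets without native coordinates
df2.loc[missing, ["lat", "long"]] = (
    df2.loc[missing, "user_location"].apply(lookup).values
)

Profile locations are free text ("somewhere on earth", emoji, etc.), so many lookups will return nothing; the inferred coordinates are also far coarser than native geotags.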

Ankita
  • do you have any code samples or the JSON format of the Twitter API responses? – OzOm18 Sep 07 '21 at 14:48
  • I have datasets that I converted into data frames, but I am not able to attach them here. I added a sample of the code – Ankita Sep 07 '21 at 14:54
  • The obvious answer is to clean up the data in `df2` first. Use defaulting logic - for example, for the same IP address assume the lat/lon will be the most frequently used one (a sketch of this idea follows these comments). As with any data-quality process like this, it's important to maintain transparency and traceability. If you are familiar with financial services, these are effectively the principles laid out in BCBS 239 – Rob Raymond Sep 07 '21 at 17:13
  • I have already cleaned the data. The problem is that very few tweets have a geolocation (about 1%). I am just wondering whether there is any way to infer extra locations so that more tweets have one. – Ankita Sep 08 '21 at 07:32
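
For the defaulting logic suggested by Rob Raymond above, a minimal sketch is shown below. Tweets do not carry IP addresses, so a hypothetical "user_id" column is used as the grouping key instead: missing lat/long values are filled with the most frequent value observed within each group.

import pandas as pd

def most_frequent(series):
    """Most frequently observed non-missing value in a group, or None."""
    s = series.dropna()
    return s.mode().iloc[0] if not s.empty else None

# Fill missing coordinates with each user's most common observed coordinate
for col in ["lat", "long"]:
    df2[col] = df2[col].fillna(
        df2.groupby("user_id")[col].transform(most_frequent)
    )

Note that filling lat and long independently can pair a latitude from one tweet with a longitude from another; if that matters, take the mode of the combined (lat, long) pair instead.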

0 Answers