Create monthly database backup task #19
Comments
@fahadkirmani for this task, take a look at the CARTO.com SQL API: https://carto.com/developers/sql-api/ You can see how we're doing the incremental updates in https://github.com/GreenInfo-Network/nyc-crash-mapper-etl-script/blob/master/main.py and another set of tasks over at https://github.com/GreenInfo-Network/nyc-crash-mapper-etl-script/tree/master/fixtallies So for this we'd end up with a The CARTO credentials are in the env variables that you should be able to access in Heroku. Let's use this issue for discussion and findings, and then we can pull in my lead dev once you've gotten to the point of having questions.
@danrademacher I have gone over the links you mentioned and also configured the environment on my Ubuntu 16.04 (Xenial) machine. When I tried running any query on CARTO https://carto.com/developers/sql-api/
You can find the CARTO API key under "config vars" here: |
I am trying this request: https://{username}.carto.com/api/v2/sql?q={SQL statement}&api_key={api_key} to get the results, but I am getting this message: {"error":["Unauthorized"]}
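For reference, here is a minimal sketch of how the request URL in the pattern above could be assembled so that the SQL statement and API key are properly URL-encoded. The username, key, and query shown are placeholders, and the helper name `build_carto_sql_url` is hypothetical; it is not taken from the ETL repository.

```python
from urllib.parse import urlencode


def build_carto_sql_url(username: str, sql: str, api_key: str) -> str:
    """Build a CARTO SQL API v2 URL with the query string URL-encoded.

    Follows the pattern quoted in this thread:
    https://{username}.carto.com/api/v2/sql?q={SQL statement}&api_key={api_key}
    """
    base = f"https://{username}.carto.com/api/v2/sql"
    # urlencode escapes spaces, parentheses, and other reserved characters
    query = urlencode({"q": sql, "api_key": api_key})
    return f"{base}?{query}"


# Placeholder values only (assumptions); substitute the real config vars.
url = build_carto_sql_url(
    "example-user",
    "SELECT COUNT(*) FROM crashes_all_prod",
    "example-key",
)
print(url)
```

An unencoded space or a mistyped key in the URL is one plausible cause of a rejected request, so building the URL programmatically removes at least the encoding variable.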
Those credentials should work since they are the ones running the daily script (which I just confirmed in Heroku logs is running fine), so let's have @gregallensworth take a look at this and see if he can get you going. Gregor, I have a new job to tie this time to, |
Working for me. I wrote a quick shell script which creates a URL to
The expected output is a row count (which will increase as more crashes are added each day):
@fahadkirmani If you run a simple SELECT query as I did, does it work or not? If it does, then the issue would be permissions for your given SQL statement. If it does not, then you may have copied the API key incorrectly. |
@gregallensworth Yes, I got a successful response to the request: {"rows":[{"count":1548595}],"time":0.499,"fields":{"count":{"type":"number"}},"total_rows":1}
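As a small sketch of how a script could tell the two responses seen in this thread apart, the snippet below parses the JSON body and either returns the row count or raises on an error payload. The function name `extract_count` is hypothetical, and the two sample bodies are copied from the comments above.

```python
import json

# Sample bodies copied from this thread
OK_BODY = '{"rows":[{"count":1548595}],"time":0.499,"fields":{"count":{"type":"number"}},"total_rows":1}'
ERROR_BODY = '{"error":["Unauthorized"]}'


def extract_count(body: str) -> int:
    """Return the count from a single-row COUNT(*) response, or raise on error."""
    payload = json.loads(body)
    if "error" in payload:
        # The error payload carries a list of messages
        raise RuntimeError("; ".join(payload["error"]))
    return int(payload["rows"][0]["count"])


print(extract_count(OK_BODY))
```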
@fahadkirmani have you been making progress here? Based on the last comment, it seemed like we resolved the API access issue. Have you been working on this since? |
@danrademacher Yes, that worked. I ran the task at https://github.com/GreenInfo-Network/nyc-crash-mapper-etl-script/tree/master/fixtallies and successfully updated the CARTO data for the period 2019-01-01 to 2019-03-28. I also wrote a shell script so that the whole process can be run in one step.
Hi @danrademacher, two socrata_id values (4181883, 4025853) are not present in CARTO, but they do exist in SODA as unique_key (4181883, 4025853). I have updated the records, but I think the scripts I have only update existing CARTO data from SODA and do not insert new records into CARTO. 09/10/2018 --> 4181883 Can you check why some IDs are getting skipped? I don't see any information missing for those two records in SODA, and the lon/lat information is present there for both rows.
The leading reason that a crash would not have been loaded into CARTO is that some crashes are not logged into Socrata for weeks or even months after the fact. Original versions of the ETL script ran daily, and looked for crashes dated the prior day (e.g. today, it would query
Crash 4181883 definitely fits into that profile. They waited 11 months to enter it, and we look back only 2 months.
For crash 4025853, they waited 16 months:
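To illustrate why a late-entered crash falls outside the look-back window, here is a rough sketch of the date check. It is not the ETL script's actual logic: it assumes "2 months" is approximated as 60 days, reads 09/10/2018 as September 10, 2018, and uses a hypothetical run date about 11 months later.

```python
from datetime import date, timedelta

LOOKBACK_DAYS = 60  # assumption: "2 months" approximated as 60 days


def in_lookback_window(crash_date: date, run_date: date,
                       lookback_days: int = LOOKBACK_DAYS) -> bool:
    """True if crash_date falls within the look-back window ending on run_date."""
    window_start = run_date - timedelta(days=lookback_days)
    return window_start <= crash_date <= run_date


crash_date = date(2018, 9, 10)  # crash 4181883, assuming MM/DD/YYYY
run_date = date(2019, 8, 10)    # hypothetical run ~11 months after the crash

# The crash is dated well before the window start, so a date-based query misses it
print(in_lookback_window(crash_date, run_date))
```

Any record entered into Socrata later than the window length after its crash date is skipped by a query of this kind, regardless of how complete the record is.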
In order to address these, I wrote the |
Python script to do the following:
crashes_all_prod
with some name like crashes_all_prod_archive_YYYYMMDD
Then
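The dated archive naming described above could be sketched as follows. The `CREATE TABLE ... AS SELECT` statement is an assumption about how the copy might be made, not something confirmed in this thread, and the function names are hypothetical.

```python
from datetime import date

SOURCE_TABLE = "crashes_all_prod"


def archive_table_name(run_date: date, source: str = SOURCE_TABLE) -> str:
    """Return a name like crashes_all_prod_archive_YYYYMMDD for the given date."""
    return f"{source}_archive_{run_date:%Y%m%d}"


def archive_sql(run_date: date, source: str = SOURCE_TABLE) -> str:
    """Assumed copy statement; the real script may create the copy differently."""
    return f"CREATE TABLE {archive_table_name(run_date, source)} AS SELECT * FROM {source}"


print(archive_table_name(date.today()))
print(archive_sql(date.today()))
```

Embedding the date in the table name keeps each monthly copy distinct and makes old archives easy to identify when cleaning up.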