The "created_at" column contains the timestamp of each tweet the row is referring to, but its current format is not idea
- Site Admin
- Posts: 899603
- Joined: Mon Aug 02, 2021 8:13 am
The "created_at" column contains the timestamp of the tweet each row refers to, but its current format is not ideal for comparing which one is earlier or later (why?). To change this, we are going to reformat the column (it is still Unicode of length 30 after conversion).

Write a function converting_timestamps(array) that converts the original timestamp format into a new format as follows:

Current format: [day] [month] [day value] [hour]:[minute]:[second] [time zone difference] [year]
New format: [year]-[month value]-[day value] [hour]:[minute]:[second]

For example, a current format value:

    Tue Feb 04 17:04:01 +0000 2020

will be converted to:

    2020-02-04 17:04:01

Note: The input to this function will be the "created_at" column. The return value should thus be in a form that can replace this column.

For example:

Test:

    data = unstructured_to_structured(load_metrics("covid_sentiment_metrics.csv"), [0, 1, 7, 8])
    data[:]['created_at'] = converting_timestamps(data[:]['created_at'])
    print(data[:]['created_at'][0])

Result:

    2020-02-29 13:32:59

Test:

    data = unstructured_to_structured(load_metrics("covid_sentiment_metrics.csv"), [0, 1, 7, 8])
    data[:]['created_at'] = converting_timestamps(data[:]['created_at'])
    print(data[:]['created_at'][0].dtype)

Result:

    <U19

Answer: (penalty regime: 0, 0, 10, 20, ... %)

    import numpy as np

    def converting_timestamps(array):
        """Convert 'Tue Feb 04 17:04:01 +0000 2020' style timestamps to '2020-02-04 17:04:01'."""
        monthVal = {'Jan': '01', 'Feb': '02', 'Mar': '03', 'Apr': '04',
                    'May': '05', 'Jun': '06', 'Jul': '07', 'Aug': '08',
                    'Sep': '09', 'Oct': '10', 'Nov': '11', 'Dec': '12'}
        converted = []
        for timestamp in array:
            # parts: [day, month, day value, time, time zone difference, year]
            parts = timestamp.split(' ')
            monthValue = monthVal[parts[1]]
            newDate = parts[5] + '-' + monthValue + '-' + parts[2] + ' ' + parts[3]
            converted.append(newDate)
        return np.array(converted)
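On the "(why?)" in the question: a minimal sketch of one likely reason. The original format starts with the weekday name, so plain string comparison orders values alphabetically by weekday. The new format puts the largest unit first, so string order matches chronological order. The helper name reformat_one and the sample timestamps are made up for illustration and are not part of the assignment. The parsing relies on the standard library's datetime.strptime and assumes an English (C) locale for the day and month abbreviations.

```python
from datetime import datetime


# Hypothetical helper, used only to illustrate the idea on single strings.
def reformat_one(timestamp):
    """Parse one 'Tue Feb 04 17:04:01 +0000 2020' style string and re-emit it."""
    # %a = weekday abbreviation, %b = month abbreviation, %z = UTC offset such as +0000
    parsed = datetime.strptime(timestamp, "%a %b %d %H:%M:%S %z %Y")
    return parsed.strftime("%Y-%m-%d %H:%M:%S")


# Made-up sample values in the original format (not taken from the CSV file).
created_at = [
    "Tue Feb 04 17:04:01 +0000 2020",
    "Sat Feb 29 13:32:59 +0000 2020",
    "Wed Jan 01 09:15:00 +0000 2020",
]
converted = [reformat_one(t) for t in created_at]

# String order on the original format follows the weekday name,
# so the "smallest" value is the one starting with "Sat" (actually the latest tweet).
print(sorted(created_at)[0])

# String order on the new format is chronological,
# so the smallest value is the earliest tweet.
print(sorted(converted)[0])
```

This is a cross-check on the idea rather than a replacement for the dictionary-based answer. strptime also rejects malformed input, which the split-based version does not.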