This week's work followed the project timeline.
The plan was to get the pipeline ready first and then move on to the remaining steps. In the first week of the coding period, I extracted the required data from ElasticSearch and converted it to the default SCMS implementation's Airtable view. This week, I randomly tagged the datasheet with five metrics and linked the additional data back to ElasticSearch through a ‘study’ called ‘enrich_extra_data’.
Here are the steps involved, in detail.
The first step was to randomly tag the dataset, in an Excel sheet, with all possible combinations of the social currency parameters: Transparency, Utility, Consistency, Merit and Trust. The tags go in a new column of the sheet named ‘scms_tags’.
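The random tagging can be sketched in a few lines of Python. This is a minimal illustration, not the script used for the project: it assumes that "all possible combinations" means every non-empty subset of the five parameters, and that the tags are stored as one comma-separated string per row. The function names are hypothetical.

```python
import random
from itertools import combinations

# The five social currency parameters named in the post.
PARAMETERS = ["Transparency", "Utility", "Consistency", "Merit", "Trust"]


def all_tag_combinations(parameters=PARAMETERS):
    """Return every non-empty combination of the parameters as tuples.

    Assumption: a row may carry anywhere from one to all five tags.
    """
    combos = []
    for size in range(1, len(parameters) + 1):
        combos.extend(combinations(parameters, size))
    return combos


def random_tags(num_rows, seed=None):
    """Pick one random combination per row, joined into a single cell value."""
    rng = random.Random(seed)  # a fixed seed makes the tagging reproducible
    combos = all_tag_combinations()
    return [", ".join(rng.choice(combos)) for _ in range(num_rows)]


# Hypothetical usage: values for the 'scms_tags' column of a 5-row sheet.
scms_tags = random_tags(5, seed=42)
for value in scms_tags:
    print(value)
```

Passing a seed keeps the random tags stable between runs, which makes it easier to compare the spreadsheet with what later shows up in ElasticSearch.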
The second step was writing a Python script, Excel2Json, which converts the Excel sheet into the specific JSON format that the study takes as input. You can find the JSON here.
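To give an idea of what such a conversion involves, here is a rough sketch. It is not the Excel2Json script itself. It assumes the sheet rows have already been loaded as dictionaries, that each row has an identifying column (called ‘uuid’ here, which is a hypothetical name), and that the study expects a list of entries, each pairing matching conditions with the extra fields to set. That entry layout is an assumption and should be checked against the grimoirelab-ELK documentation.

```python
import json


def rows_to_study_json(rows, key_field="uuid", tag_field="scms_tags"):
    """Convert spreadsheet rows into a JSON string for the study.

    Assumption: each entry names the item to match ("conditions") and
    the extra fields to attach to it ("set_extra_fields").
    """
    entries = []
    for row in rows:
        tags = row.get(tag_field)
        if not tags:
            continue  # rows without tags add nothing to the study input
        entries.append({
            "conditions": [{"field": key_field, "value": row[key_field]}],
            "set_extra_fields": [{"field": tag_field, "value": tags}],
        })
    return json.dumps(entries, indent=2)


# Hypothetical rows, standing in for what was read from the Excel sheet.
sample_rows = [
    {"uuid": "item-1", "scms_tags": "Utility, Trust"},
    {"uuid": "item-2", "scms_tags": ""},
    {"uuid": "item-3", "scms_tags": "Merit"},
]

study_json = rows_to_study_json(sample_rows)
print(study_json)
```

Reading the Excel file is left out on purpose, so the sketch needs nothing beyond the standard library.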
The third step was to execute a study called ‘enrich_extra_data’ in grimoirelab-ELK.
To do this, edit Setup.cfg to enable the study and give it the URL of the JSON produced above.
After the study runs successfully, the ElasticSearch indexes contain the extra parameter scms_tags in the dump.
The study prefixes the field name with ‘extra’, so the field appears as ‘extra_scms_tags’.
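The effect of the study on a single document can be illustrated with a small simulation. This is not grimoirelab-ELK's implementation, only a sketch of the behaviour described above. The entry layout and the ‘uuid’ field are the same assumptions as before, and `apply_extra_fields` is a hypothetical helper.

```python
def apply_extra_fields(item, entries, prefix="extra_"):
    """Return a copy of item with the extra fields of every matching entry.

    An entry matches when all of its conditions equal the item's values.
    Each added field name is prefixed, so 'scms_tags' becomes
    'extra_scms_tags'.
    """
    enriched = dict(item)  # leave the original document untouched
    for entry in entries:
        matches = all(
            item.get(cond["field"]) == cond["value"]
            for cond in entry["conditions"]
        )
        if matches:
            for extra in entry["set_extra_fields"]:
                enriched[prefix + extra["field"]] = extra["value"]
    return enriched


# Hypothetical study input and enriched document.
entries = [
    {
        "conditions": [{"field": "uuid", "value": "item-1"}],
        "set_extra_fields": [{"field": "scms_tags", "value": "Utility, Trust"}],
    }
]
document = {"uuid": "item-1", "author": "someone"}

result = apply_extra_fields(document, entries)
print(result["extra_scms_tags"])
```

A document whose ‘uuid’ does not match any entry comes back unchanged, which is why untagged rows never gain the ‘extra_scms_tags’ field.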
The importance of the SCMS' Codex
After building out the data set and meeting with my mentors on June 12th, I understood the importance of the codex sheet in training the tagging setup. The minutes of the meeting are here.
Defining the codex table helps increase the universality and decrease the subjectivity of the data. It also allows us to rely more on qualitative data than on quantitative data.
After the training, we discussed the plan for next week. I’ll be making a codex table that contains the definition of each trend observed, along with its ‘when to use’ and ‘when not to use’ cases.
Additionally, I should note that we hit Airtable's limit on the number of records, so we plan to move the implementation to Google Sheets. Future blog posts will use Google Sheets.
This week went well, and I'm looking forward to the next one!
Make sure you have a look at the project updates on GitHub #ria18405/GSoC.
All questions and comments are welcome!
Stay tuned for more weekly updates.