DRE Data Analysis App 2021

<- Back to Initiatives       Project Documents - >

Project Summary

Need: The DRE is looking to improve efficiency and eliminate siloing for the analysis and reporting of the MOE's tests and assessments.
Purpose: To develop tools to automate data cleaning and analysis and to centralize storage of data and reports for the MOE's Palau English Reading Assessment (PERA).
Developer: Ms. Yaxuan (Rebecca) Li, TCinGC 2021 student consultant.
Local Partner: Mr. Keizy Shiro, DRE Assessment Specialist.
Status: Just started on June 3 2021.
Next: Expected to end about August 10 2021.


In general the MOE has issues with data siloing and lack of formal processes for data storage and analysis. This results in duplication of effort, long process times, and inability to audit and verify reports. While efficiency studies have not been conducted, in at least two individual cases, process times has been reduced from several days to two hours after implementing centralized storage and some automation.

Using the Division of Research and Evaluation (DRE) as an example, the graphic below illustrates the current state of data analysis, and shows the role of the silo in the process.

The desired workflow is illustrated in the graphic below. It sets up a centralized area with automation tools and central storage. The tools handle cleaning, uploads to storage, downloads from storage, analysis, and report generation. The desired workflow places automation tools at certain points in the process to take care of repetetive tasks, store data, move data where needed, and eliminate silos by provide access to the latest and cleanest data to everyone (obviously security will be implemented to allow only authorized access).

Note that DRE is missing from the graphic. In the current scenario, data ended up with Keizy at DRE because he was assigned to analyse the report. Hence the data is in Keizy's silo in the DRE. In the desired scenario, the data goes to the most appropriate place for storage. The owner of the data, BCI, can have access, and so can other people who need to perform analysis on the data or perform and audit of a report.

Please note that this is a more lower scale treatment of the situation that was addressed by the "Data Analysis Stack 2020" project. That project looked at the whole process from data source to the report. It was not adopted after its completion. So this project can help out while DRE assesses its options regarding its data analysis process.

Project Overview

Yaxuan Ms. Yaxuan (Rebecca) Li is the developer for this project. Ms. Li is a rising junior at Carnegie Mellon University majoring in Information Systems and a minor in Business Analytics and Optimization and Human Computer Interaction. She is from Beijing, China and loves traveling around the world. She is able to speak four different languages, including Mandarin, English, Japanese, and French. She is interested in developing innovative ideas and talking with creative intellectuals. Ms. Li's Resume.
Keizy Mr. Keizy U. Shiro is the local counterpart to Ms. Li on the project. Mr. Shiro is originally from the beautiful state of Ngaraard, situated to the north-east of Babeldaob Island in the Republic of Palau. Mr. Shiro has over 18 years of professional work experience, which began with his employment career with Palau High School as a classroom teacher in 2000. He last served as Palau High School Registrar from 2004 to 2007. Mr. Shiro is currently working as the Testing Coordinator at the Division of Research and Evaluation within Palau’s Ministry of Education.

The Palau English Reading Assessment (PERA) workflow was selected to the target of the project. The two will work to create and establish a PERA workflow that is more automated and more centralized. The automation improves efficiency. The centralization reduces duplication of effort and eliminates data and process silos.

The basic tasks of the workflow are: clean the data, store the data, grab the data from storage, analyse the data, and generate reports. The project will build tools to automate these tasks as much as possible and to facilitate the workflow, ie., transferring the data between storage and users and tools.

Project Detail

Weeks 1 and 2: Orientation, Context Review and Analysis, Project Selection.

  • Completed, June 3-17, 2021.
  • Draft Work Proposal is on file.

Week 3: Research and finalize how the project will be done.

  • Completed, June 18-24: The data cleaning tool will be built using MS Excel and VBA; The data upload to MySQL tool will be built using MS Excel and VBA (since this is the location of the data at this stage).
  • June 21-25 Goal 1 – Build the data cleaning algorithm
    • Outcomes status: DONE
      1. Progress regarding activities planned in the prior status report & outputs:
        • Discussed with Edwel to know if the idea of taking exams using tablets is possible [DONE]
        • Learned Excel VBA and coded the basic requirements of the cleaning algorithm [DONE]
        • Meet with Professor Poepping and Edwel and Keizy to discuss the remaining work for the following 7-8 weeks [DONE]
        • Meet with Edwel on Friday to update the weekly status [DONE]
      2. Activities planned to be completed next week (be sure to include important meetings):
        • Meet with Edwel to update the data cleaning algorithm [PENDING]
        • Finish writing data cleaning algorithm [PENDING]
        • Meet with Edwel on Friday to update the weekly status [PENDING]
  • June 21-25 Goal 2 – Finish the data storing and downloading process using ODB
    • Outcomes status: PENDING
  • June 21-25 Goal 3 – Compare the built-in graph generating functions in excel and R
    • Outcomes status: PENDING

Week 4: Complete the first version of the cleaning and upload tools.

  • In process, June 25 - July 1

July 9, 2021 0800-0800: MVP Meeting on Zoom

  • Ms. Li, Professors Peopping and Mertz, Mr. Ongrung
  • Ms. Li has successfully demonstrated a viable MS Excel data cleaning tool
  • Ms. Li is spending the appropriate time with Mr. Keizy Shiro to understand the PERA process
  • Ms. Li's progress is good; there is enough time to demonstrate an analysis tool and then try out the PERA process with the tools.

Weeks 5-7: Complete the basic data cleaning tool

  • The basic data cleaning tool is complete. It does the basic cleaning and exports data to tsv file so it can be uploaded to the central storage.
  • After several meetings with Keizy, the analysis to be automated are:
    • For item analysis for each test, a table of frequency by test item options will be generated.
    • Descriptive statistics for each test will be generated.
    • Pass/Fail statistics for each test will be generated.
    • Proficiency levels statistics will be generated.
  • The basic analysis tool has been completed. It downloads the score data from central storage, generates the four kinds of statistics tables for each test, and exports the tables to tsv files that could be uploaded to central storage.

Week 8: Refine code, generate the graphs for each table

  • Implemented the code to automate the opening files from the database [DONE]
  • Implemented the code to automate the data analysis for eight grades for a given year [DONE]
  • Implemented the code to save three tables into tsv forms [DONE]
  • Meet with Edwel on Friday to update the weekly status [DONE]
  • Generate the graphs for each table [PENDING]
  • Refine the code for four tables for analysis [PENDING]

Week 9: Generate the graphs for each table

  • Implemented the code to automate the required graph for descriptive statistics [DONE]
  • Implemented the code to automate the pie charts and bar chart for pass and failed worksheet [DONE]
  • Implemented the code to automate the bar chart of proficiency level for each grade level [DONE]
  • Meet with Edwel on Friday to update the weekly status [DONE]

Week 10: Wrap up the project

  • Implement the code for the automated comparison part for pre and post PERA test [DONE]
  • Implemented the code to automate charting part of average score comparison and enable the t-test [DONE]
  • Meet with Edwel on Friday to update the weekly status [DONE]
  • Implement the code to generate report for a specific school [DONE]
  • Implement the code to generate report for all the schools based on a given school year [DONE]
  • Refine the user interface and optimize the code [DONE]
  • Finish the report for PERA 2021 [DONE]