This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Human Factors, is properly cited. The complete bibliographic information, a link to the original publication on http://humanfactors.jmir.org, as well as this copyright and license information must be included.
Mobile data collection systems are often difficult to use for nontechnical or novice users. This can be attributed to the fact that developers of such tools do not adequately involve end users in the design and development of product features and functions, which often creates interaction challenges.
The main objective of this study was to assess the guidelines for form design using high-fidelity prototypes developed based on end-user preferences. We also sought to investigate the association between the results from the System Usability Scale (SUS) and those from the Study Tailored Evaluation Questionnaire (STEQ) after the evaluation. In addition, we sought to recommend some practical guidelines for the implementation of the group testing approach particularly in low-resource settings during mobile form design.
We developed a Web-based high-fidelity prototype using Axure RP 8. A total of 30 research assistants (RAs) evaluated this prototype in March 2018 by completing the given tasks during 1 common session. An STEQ comprising 13 affirmative statements and the commonly used and validated SUS were administered to evaluate the usability and user experience after interaction with the prototype. The STEQ responses were summarized using frequencies in an Excel sheet, while the SUS scores were calculated according to whether each statement was positive (user selection minus 1) or negative (5 minus user selection). The score contributions were summed and multiplied by 2.5 to give each participant's overall form usability score.
Of the RAs, 80% (24/30) appreciated the form progress indication, found the form navigation easy, and were satisfied with the error messages. The results gave a SUS average score of 70.4 (SD 11.7), which is above the recommended average SUS score of 68, meaning that the usability of the prototype was above average. The scores from the STEQ, on the other hand, indicated a 70% (21/30) level of agreement with the affirmative evaluation statements. The results from the 2 instruments indicated a fair level of user satisfaction and a strong positive association as shown by the Pearson correlation value of .623 (
A high-fidelity prototype was used to give the users experience with a product they would likely use in their work. Group testing was done because of the scarcity of resources, such as the costs and time involved, especially in low-income countries. If embraced, this approach could help assess the user needs of diverse user groups. With proper preparation and the right infrastructure at an affordable cost, usability testing could lead to the development of highly usable forms. The study thus makes recommendations on practical guidelines for implementing the group testing approach during mobile form design, particularly in low-resource settings.
Usability implementation in many design scenarios, even in user-centered designs (UCDs), is still unsatisfactory [
Mobile user interface designs are usually based on the desktop paradigm, which does not fully fit the mobile context [
Usability is mainly concerned with the exhibited design features of interactive products in relation to how easy the user interface is to use [
User testing is one of the usability evaluation methods where the assessment of the usability of a system is determined by observing the users working with that system [
This study therefore assesses a set of design guidelines using the group testing approach and records the end users’ experience after interacting with the high-fidelity prototype. It also recommends some practical ways of implementing group testing during mobile form design, particularly in low-resource settings. To achieve this, a high-fidelity prototype was developed based on the end users’ design preferences and was evaluated by the research assistants (RAs) for usability and UX after interaction, using the SUS and the STEQ. We report the level of satisfaction and the prototype features the RAs were satisfied with.
The study participants were 30 RAs, all of whom were collecting data on a maternal and child health project (the Survival Pluss project) in northern Uganda, funded by the Norwegian Programme for Capacity Development in Higher Education and Research for Development (NORHED) [
A Web-based high-fidelity prototype for MEDCFs was developed between January and February 2018. This prototype was meant to demonstrate the RAs’ design preferences, which had been collected earlier using a mid-fidelity prototype [
The prototype had 3 main sections structured based on the project’s content: a demographic section, where participants were required to fill in the participant ID, interviewer name, and interviewer telephone number; section I, which had list pickers; and section II, which showed different table designs capturing a child’s sickness record. We explained to the RAs the potential value of the user testing exercise before giving them access to the prototype and to the tasks they were supposed to do. A summary of the entered data on the child’s sickness was available for the users to cross-check and
The group testing exercise was conducted in February 2018 in Lira, Uganda. The RAs were required to complete some tasks (
The prototype evaluation happened immediately after the group testing exercise. This was an ex-post naturalistic evaluation because we were evaluating an instantiated artifact in its real environment, that is, with the actual users and in the real setting [
A total of 2 instruments were used to evaluate the prototype’s usability: one was the SUS, a standardized questionnaire, and the other was the STEQ. By combining the two, we expected to gain more detailed insight and also to test our generated questionnaire against the standardized one. These 2 posttest questionnaires were administered after the participants had completed the tasks, in a bid to show how users perceived the usability of the data collection forms [
The STEQ comprised 13 statements and was developed based on the literature, with the purpose of providing an alternative instrument to the SUS. The statements addressed features such as form progress, simplicity of use, error correction and recovery, and visual appeal, among others. The RAs were required to indicate their level of agreement with the evaluation statements by selecting options, which included
The SUS is a balanced questionnaire that is used to evaluate the usability of a system and comprises 10 alternating positive and negative statements [
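To make the scoring arithmetic concrete: each positive (odd-numbered) statement contributes the user's selection minus 1, each negative (even-numbered) statement contributes 5 minus the selection, and the summed contributions are multiplied by 2.5, yielding a 0 to 100 score. The following is a minimal illustrative sketch, not the study's analysis script; the example responses are hypothetical:

```python
def sus_score(responses):
    """Compute one participant's SUS score from the 10 item responses (each 1-5).

    Odd-numbered (positive) items contribute (response - 1); even-numbered
    (negative) items contribute (5 - response). The summed contributions
    (0-40) are multiplied by 2.5, giving a 0-100 score.
    """
    if len(responses) != 10:
        raise ValueError("The SUS requires exactly 10 responses")
    total = 0
    for i, r in enumerate(responses, start=1):  # i is the 1-based item number
        total += (r - 1) if i % 2 == 1 else (5 - r)
    return total * 2.5

# Hypothetical responses (not study data):
print(sus_score([4, 2, 5, 1, 4, 2, 5, 2, 4, 1]))  # -> 85.0
```

Averaging such per-participant scores across the 30 RAs gives the overall figure reported in the Results.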
The 13 statements in the tailor-made evaluation questionnaire and the number of respondents (n=30) in each category from
Evaluation statement | Strongly disagree, n (%) | Disagree, n (%) | Neutral, n (%) | Agree, n (%) | Somewhat agree, n (%) | Don’t agree, n (%) | Total (N)a |
The form informs about its progress during interaction | 0 (0) | 0 (0) | 2 (6) | 8 (27) | 20 (67) | 0 (0) | 30 |
The information, for example, onscreen messages provided in this form were clear | 1 (3) | 0 (0) | 3 (11) | 4 (14) | 18 (64) | 2 (7) | 28 |
It was easy to move from one page to another | 3 (10) | 2 (6) | 1 (3) | 8 (27) | 15 (50) | 1 (3) | 30 |
The overall organization of the form is easy to understand | 1 (3) | 0 (0) | 2 (6) | 13 (43) | 12 (40) | 1 (3) | 30 |
I knew at every input what rule I had to stick to (possible answer length, date format, etc) | 2 (6) | 3 (10) | 7 (23) | 5 (17) | 13 (43) | 0 (0) | 30 |
Reading of characters on the form screen is easy | 1 (3) | 3 (10) | 9 (30) | 17 (57) | 0 (0) | 0 (0) | 30 |
The form gave error messages that clearly told me how to fix the problems | 3 (10) | 1 (3) | 1 (3) | 2 (6) | 21 (70) | 2 (6) | 30 |
I was able to fill in the form quickly | 2 (6) | 4 (13) | 3 (10) | 8 (27) | 13 (43) | 1 (3) | 30 |
It was simple to fill this form | 1 (3) | 1 (3) | 5 (17) | 10 (33) | 13 (43) | 0 (0) | 30 |
Whenever I made a mistake when filling the form I could recover easily and quickly | 0 (0) | 1 (3) | 2 (6) | 5 (17) | 21 (70) | 1 (3) | 30 |
This form is visually appealing | 0 (0) | 2 (6) | 6 (20) | 10 (33) | 10 (33) | 2 (6) | 30 |
Overall, the form is easy to use | 1 (3) | 2 (6) | 1 (3) | 8 (27) | 17 (57) | 1 (3) | 30 |
Overall, I am satisfied with this form | 0 (0) | 0 (0) | 7 (21) | 8 (27) | 14 (41) | 1 (3) | 30 |
aSome respondents did not reply to all statements.
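As an illustration of how the agreement levels above can be summarized, the short sketch below computes the share of respondents for a statement who selected an agreement option. It assumes, as a simplification, that "Agree" and "Somewhat agree" together constitute agreement; the counts shown are taken from the first row of the table above.

```python
def agreement_share(counts, agree_keys=("Agree", "Somewhat agree")):
    """Share of respondents whose answer falls in the agreement categories."""
    answered = sum(counts.values())
    agreed = sum(counts.get(k, 0) for k in agree_keys)
    return agreed / answered if answered else 0.0

# Counts for the first statement ("The form informs about its progress..."):
progress = {"Strongly disagree": 0, "Disagree": 0, "Neutral": 2,
            "Agree": 8, "Somewhat agree": 20, "Don't agree": 0}
print(f"{agreement_share(progress):.0%}")  # -> 93%
```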
Results from the 2 instruments were compared. Previous studies have shown that irrespective of the questionnaires used being balanced or affirmative, the scores from the 2 questionnaires are likely to be similar [
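As an indication of how such a comparison can be made, the sketch below computes a Pearson correlation between per-participant scores from the two instruments. The function is generic; the score lists shown are hypothetical placeholders, not the study data:

```python
from statistics import mean, stdev

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length samples."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)
    return cov / (stdev(xs) * stdev(ys))

# Hypothetical per-participant scores (0-100), not the study data:
sus_scores  = [50, 62.5, 70, 72.5, 80, 90]
steq_scores = [55, 60, 68, 75, 78, 88]
print(round(pearson_r(sus_scores, steq_scores), 3))
```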
This section presents the results after evaluation of the high-fidelity prototype using the tailor-made evaluation questionnaire and the SUS.
Of the RAs, 80% (24/30)
However, more than 23% (7/30) of the participants
The individual SUS scores ranged from 50 to 90 (
We plotted a graph to compare the association between the time it took to complete the form and the SUS scores (
Using these instruments concurrently turned out to be important because we were able to test for both usability and UX. In this study, the SUS is meant to measure usability, whereas the evaluation questionnaire is more detailed and meant to capture more of the UX after including the new design preferences.
The participants with the lowest SUS scores all found that the form was neither simple to fill in nor easy to use, and they were also not satisfied with it, as depicted in the STEQ. These results could be attributed to a general comparison between the forms they had been using (ODK) and the high-fidelity prototype. They felt that the prototype was limiting their usage because, owing to missing functionality, they could not freely do what they were used to doing with ODK. In general, the results from these 2 instruments show that the 2 evaluation methods or instruments are meant to complement each other and not to compete against each other [
The percentage of participants who agreed, disagreed or were neutral to the evaluation statements.
Results from the research assistants’ (RAs) evaluation using the System Usability Scale (n=30).
System Usability Scale compared with form completion time (minutes).
System Usability Scale (SUS) score compared with the Study Tailored Evaluation Questionnaire (STEQ) score. RA: research assistant.
We also note that the results for our generated affirmative STEQ do not depict any acquiescence bias because there were variations in the number of participants who
Our findings from the STEQ indicated that about 70% of the responses were
We used 30 participants in this study, contrary to the 5 recommended by some researchers. The justification for the number of usability testers varies and is usually linked to the benefit-cost ratio [
Usability is not an absolute concept, but is relative, dependent on the task and the user [
Prototype evaluation as a means of usability testing may not necessarily identify all the design problems in the prototype comprehensively [
Metrics from posttest evaluations do not indicate why users struggle with a given design, nor do they provide insight into how the design can be improved, because their main focus is on tracking how users feel about using a given product [
It is important to note that the SUS questionnaire was given after the first evaluation questionnaire, when some of the participants were probably tired and had lost concentration, which may have influenced the SUS scores. It was evident in some questionnaires that the users did not give much thought to what they were evaluating but ticked the same score across all the statements; for example, 1 participant who scored 50 selected
It was not possible to attach the users’ experience to their individual scores, because we collected the demographics data during the evaluation of the mid-fidelity prototype [
The results also indicate that the participants were not satisfied with the size of the screen characters and the visual appeal. One could argue that the phone screen was small, as in some cases one had to scroll up and down several times on the same page to fill in the content on that screen. This could have had an impact on the scores from the RAs and the subsequent results.
A considerable amount of time was spent trying to secure an internet connection, and once obtained, the connection was rather slow, which affected the prototype’s loading time. As a result, the participants had to work in shifts because the connection could only support 5 people at a time, meaning that some of the participants had to wait for hours before they could finally begin the exercise. Second, the Survival Pluss project has a follow-up component for its recruited mothers, and some of the RAs had prior appointments to meet these mothers at the time when we were carrying out the evaluation. This also prolonged the time taken to carry out the evaluation because some of the RAs were not available on particular days or at particular times.
Tailoring OSS solutions to user-specific needs and preferences at reasonable costs is worth the effort. We thus recommend that data collectors worldwide be involved in form design and evaluation, as early involvement could also help in understanding the potential of the group, their preferences, and the group’s appropriate design solutions.
It is also important to consider the infrastructure and the user groups in such group testing activities; in this case, for example, it would be advisable to make the prototype accessible offline, especially in areas where internet access is a challenge.
It is not always feasible for software developers to include more resource-demanding features such as rich graphics, and perhaps some elements of gamification, but it is important to note that the RAs will always have some expectations that are worth exploring and considering.
Evaluating user design preferences to determine the UX using the group testing approach is not a common approach in the development of mobile data collection forms, and yet this could be one way of tailoring design to the user needs so as to cater for the diversity in context and user groups especially in rural Africa [
Screenshots showing the high-fidelity prototype.
Tasks carried out during interaction with the prototype.
DSR: design science research
MEDCF: mobile electronic data collection form
NORHED: Norwegian Programme for Capacity Development in Higher Education and Research for Development
ODK: Open Data Kit
OSS: open-source software
RA: research assistant
STEQ: Study Tailored Evaluation Questionnaire
SUS: System Usability Scale
UCD: user-centered design
UX: user experience
This work was funded by the Norwegian Agency for Development Cooperation (Norad) through the NORHED-funded HI-TRAIN project. However, the program had no role in the study design, data collection and analysis, interpretation of results, or writing of the paper.
AM wrote the protocol and participated in data collection and analysis. TT participated in data collection. AB participated in data collection and analysis. All authors participated in the preparation of the paper and approved its final version.
None declared.