Take-home Exercise 2:

Published

September 30, 2025

Modified

October 2, 2025

Important

This handout provides the context, tasks, expectations and grading criteria of Take-home Exercise 2. Students must review and understand them before starting the take-home exercise.

Setting the Scene

Urban mobility refers to the movement of people and goods within and between cities, spanning all modes of transport such as walking, cycling, public transit, private vehicles, and freight. Driven by factors like population growth and climate change, urban mobility planning aims to create smart, efficient, and sustainable networks that reduce congestion and emissions while enhancing quality of life and economic vitality. Key developments include the rise of shared mobility services, autonomous vehicles, and integrated smart systems that prioritize user experience, safety, and inclusivity in urban environments.

As city-wide urban infrastructures such as buses, taxis, mass rapid transit, public utilities and roads become digital, the datasets obtained can be used as a framework for tracking movement patterns through space and time. This is especially true with the widespread deployment of pervasive computing technologies such as smart cards, GPS, and RFID in vehicles. For example, route and ridership data are collected using the smart cards and Global Positioning System (GPS) devices available on public buses. These large-scale movement datasets often contain patterns that reveal important characteristics of urban mobility. Identifying and analyzing such patterns can deepen our understanding of human mobility, supporting better urban management. For both public and private transport providers, these insights enable more informed decisions and improved service planning.

In practice, however, the use of these massive location-aware datasets to understand urban mobility patterns and trends tends to be confined to simple tracking and mapping with GIS applications. This is mainly because conventional GIS tools lack robust geospatial statistics capabilities for analyzing and modeling spatial and spatio-temporal data.

Objectives

Local Measures of Spatial Autocorrelation (LMSA) are geospatial statistics for identifying specific locations within a map that show significant spatial clustering of similar or dissimilar values. Unlike global measures that summarize the entire map, LMSA tests each location individually to determine whether its value, in relation to its neighbors, is more clustered or dispersed than expected by chance. By calculating a statistic for each individual location, LMSA helps to find hot spots (high values clustered together), cold spots (low values clustered together), and outliers (high values surrounded by low values, or vice versa). Popular LMSA statistics include Local Moran’s I, Local Geary’s C, and the Getis-Ord Gi*.
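To make the idea of testing each location against its neighbours concrete, here is a minimal sketch of the Local Moran's I calculation. The exercise itself is to be completed in R; this Python version only illustrates the arithmetic. The function name, the neighbour-list input format, and the use of row-standardised weights are assumptions made for illustration, and the permutation step needed to obtain p-values is not shown.

```python
from statistics import fmean


def local_morans_i(values, neighbours):
    """Return Local Moran's I for every location.

    values:     one observed value per location.
    neighbours: neighbours[i] lists the indices of location i's neighbours.
    Weights are assumed to be row-standardised (each neighbour weighs 1/k).
    """
    n = len(values)
    mean = fmean(values)
    z = [v - mean for v in values]      # deviations from the mean
    m2 = sum(d * d for d in z) / n      # second moment of the deviations
    if m2 == 0:
        raise ValueError("all values are identical; Local Moran's I is undefined")

    result = []
    for i in range(n):
        nbrs = neighbours[i]
        # Spatial lag: average deviation of the neighbours (0 if isolated).
        lag = fmean(z[j] for j in nbrs) if nbrs else 0.0
        result.append(z[i] * lag / m2)
    return result


# Five locations in a row; the middle one is a high value among low values.
trips = [1, 1, 10, 1, 1]
chain = [[1], [0, 2], [1, 3], [2, 4], [3]]
print(local_morans_i(trips, chain))
```

A positive value indicates a location surrounded by similar values (a potential cluster), while a negative value, as for the middle location in this toy example, indicates a potential outlier.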

Emerging Hot Spot Analysis (EHSA) is a spatio-temporal technique that complements LMSA when working with spatio-temporal data. It uses a space-time cube to identify and categorize clusters of high or low values (hot and cold spots) over time. By combining local Gi* statistics to find clusters with the Mann-Kendall test to detect trends, EHSA categorizes locations into trends such as “new,” “intensifying,” “diminishing,” or “sporadic” hot and cold spots, revealing how spatial patterns are evolving.
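The trend-detection step can be illustrated with a minimal sketch of the Mann-Kendall test applied to one location's series of values (for example, its Gi* values over time). The exercise itself is to be completed in R; this Python version only illustrates the arithmetic. The function name is hypothetical, and the variance formula used here assumes the series contains no tied values.

```python
import math


def mann_kendall(series):
    """Return (S, Z, two-sided p-value) for a time-ordered series.

    Assumes no tied values, so the simple variance formula applies.
    """
    n = len(series)
    if n < 3:
        raise ValueError("need at least three observations")

    # S counts later-minus-earlier increases minus decreases over all pairs.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)

    var_s = n * (n - 1) * (2 * n + 5) / 18   # variance of S without ties

    # Continuity-corrected normal approximation.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0

    p = math.erfc(abs(z) / math.sqrt(2))     # two-sided p-value
    return s, z, p


# A steadily increasing series yields a positive S and a small p-value.
print(mann_kendall([1, 2, 3, 4, 5, 6]))
```

A positive Z suggests an increasing trend and a negative Z a decreasing one; EHSA combines the sign and significance of this trend with the hot or cold spot status at each time step to assign a category.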

Both LMSA and EHSA hold tremendous potential to address complex urban mobility problems facing society. In this exercise, you will use the Passenger Volume by Origin Destination Bus Stops data provided by LTA DataMall to explore two important geospatial statistical approaches for analysing urban mobility:

Local Measures of Spatial Autocorrelation (LMSA)

  • Learn how to apply Local Moran’s I, Local Geary’s C, and the Getis-Ord Gi* statistic.
  • Understand how LMSA differs from global measures by testing each location individually, revealing where values cluster together or stand apart.
  • Identify hot spots (areas of high values clustered together), cold spots (areas of low values clustered together), and spatial outliers (unusual locations where high values are surrounded by low values, or vice versa).

Emerging Hot Spot Analysis (EHSA)

  • Learn how EHSA complements LMSA by adding a temporal dimension.
  • Use the space-time cube framework to track how hot and cold spots evolve over time.
  • Apply EHSA categories (e.g., “new,” “intensifying,” “diminishing,” “sporadic”) to understand whether travel patterns are strengthening, weakening, or shifting.

In summary, the objectives of this take-home exercise are:

  • To discover how bus mobility patterns differ across neighborhoods, providing deeper insights than global summary measures.
  • To detect emerging patterns in urban mobility, helping planners anticipate commuting demands and design more responsive bus services and policies.

The Task

The specific tasks of this take-home exercise are as follows:

Geospatial Data Science

  • Derive an analytical hexagon layer with an apothem of 375m (i.e. the perpendicular distance between the centre of the hexagon and its edges) to represent the traffic analysis zones (TAZ).

  • With reference to the time intervals provided in the table below, compute the passenger trips generated by origin at the hexagon level.

    Peak hour period       | Bus tap on time
    Weekday morning peak   | 6am to 9am
    Weekday afternoon peak | 5pm to 8pm
    Weekend/holiday        | 11am to 8pm
  • Display the geographical distribution of the passenger trips by using appropriate geovisualisation methods.

  • Describe the spatial patterns revealed by the geovisualisation (not more than 200 words per visual).
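As a quick check on the hexagon specification and the peak-period table above, the sketch below works out the hexagon dimensions implied by a 375m apothem and assigns a tap-on hour to a peak period. The exercise itself is to be completed in R; this Python version only illustrates the arithmetic. The function names are hypothetical, and treating each interval as half-open (for example, hours 6, 7 and 8 for "6am to 9am") is an assumption that you should adjust to your own interpretation of the table.

```python
import math

APOTHEM_M = 375.0  # centre-to-edge distance given in the task


def hexagon_dimensions(apothem):
    """Return (edge-to-edge width, side length, area) of a regular hexagon."""
    width = 2 * apothem                      # distance between opposite edges
    side = 2 * apothem / math.sqrt(3)        # length of one side
    area = 2 * math.sqrt(3) * apothem ** 2   # area in square units
    return width, side, area


# (is_weekday, start hour, end hour); end hour is excluded (assumption).
PEAKS = {
    "Weekday morning peak": (True, 6, 9),
    "Weekday afternoon peak": (True, 17, 20),
    "Weekend/holiday": (False, 11, 20),
}


def peak_period(is_weekday, hour):
    """Return the peak period label for a tap-on hour (0-23), or None."""
    for label, (weekday, start, end) in PEAKS.items():
        if is_weekday == weekday and start <= hour < end:
            return label
    return None


width, side, area = hexagon_dimensions(APOTHEM_M)
print(f"edge-to-edge width: {width:.0f} m, side: {side:.1f} m, area: {area:.0f} m^2")
print(peak_period(True, 7), peak_period(False, 12), peak_period(True, 12))
```

The key point is that a 375m apothem corresponds to a 750m distance between opposite edges, which is the figure usually needed when a grid tool asks for a cell size.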

Local Measures of Spatial Autocorrelation (LMSA) Analysis

  • Compute the LMSA statistics of the passenger trips generated by origin at the hexagon level.
  • Display the LMSA maps of the passenger trips generated by origin at the hexagon level. The maps should only display the significant results (i.e. p-value < 0.05).
  • With reference to the analysis results, draw statistical conclusions (not more than 200 words per visual).

Emerging Hot Spot Analysis (EHSA)

With reference to the passenger trips by origin at the hexagon level for the three time intervals given above:

  • Perform the Mann-Kendall test by using the spatio-temporal local Gi* values.
  • Prepare EHSA maps of the Gi* values of the passenger trips by origin at the hexagon level. The maps should only display the significant results (i.e. p-value < 0.05).
  • With reference to the EHSA maps and data visualisation prepared, describe the spatial patterns revealed (not more than 250 words per cluster).

The Data

Aspatial data

For the purpose of this take-home exercise, the latest month Passenger Volume by Origin Destination Bus Stops downloaded from LTA DataMall will be used.

Geospatial data

Two geospatial datasets will be used in this study:

  • Bus Stop Location from LTA DataMall. It provides information about all the bus stops currently being serviced by buses, including the bus stop code (identifier) and location coordinates.
  • Master Plan 2019 Subzone Boundary (No Sea) from Singapore’s open data portal.

Grading Criteria

This exercise will be graded by using the following criteria:

  • Geospatial Data Wrangling (20 marks): You will be assessed on your ability to prepare analysis-ready data, including:

    • Correct import of tabular and geospatial data from multiple sources.
    • Appropriate data cleaning, transformation, and derivation of new variables.
    • Correct use of coordinate reference systems and spatial joins.
    • Use of tidy, efficient R code and clear explanations of purpose.
Warning

All data are like a vast grassland full of land mines. Your job is to clear those mines, not to step on them.

  • Geospatial Analysis (30 marks): You will be assessed on your ability to apply spatial statistical methods rigorously, including:

    • Appropriate choice and correct application of spatial statistics (e.g., Local Moran’s I, Getis-Ord Gi*, EHSA).
    • Accurate interpretation of outputs, with evidence-based reasoning.
    • Demonstrating awareness of assumptions, spatial scales, and limitations of chosen methods.
    • Clear articulation of how results address the stated objectives.
  • Geovisualisation and geocommunication (20 marks): You will be assessed on your ability to communicate results through effective geovisualisation, including:

    • Use of clear, accurate, and professional map designs (appropriate symbology, color scales, legends, labels).
    • Selection of visual forms that best reveal spatial patterns and support decision-making.
    • Concise and insightful written commentary (≤200 words per visual) that explains findings in plain, non-technical language.
    • Effective integration of visuals into a coherent narrative.
  • Reproducibility (15 marks): You will be assessed on your ability to ensure that your analysis is fully reproducible, including:

    • Use of Quarto with code chunks that run end-to-end without modification.
    • Clear explanation of purpose for each step (not just code, but why it is done).
    • Logical organisation of workflow, with modular structure and meaningful sectioning.
    • Proper documentation of R packages, data sources, and dependencies to allow replication by others.
  • Bonus (15 marks): Optional extension tasks reward advanced technical work and reproducible outputs. Students may attempt any combination of the tasks listed below; points add up to a maximum of 15 marks. To be eligible for bonus, the core submission must pass the minimum standard for reproducibility (see Reproducibility criterion). All bonus work must be submitted with the main deliverable (no separate late submissions for bonus).

Tasks & points:

  • Advanced method or validation — 5 marks.
  • Interactive delivery (Shiny/Quarto) — 5 marks.
  • Extra data fusion / validation — 5 marks.

Grading Rubric

Criterion | Weight | Good and above (>= 80 marks) | Satisfactory (≈70–79 marks) | Needs Improvement (< 70 marks)
Geospatial Data Wrangling | 20 | Data imported, cleaned, and transformed flawlessly; CRS and joins handled correctly; derived variables accurate; workflow efficient and well-documented. | Data mostly correct but with minor issues (occasional CRS mismatch, redundant steps, unclear documentation). | Major errors (misaligned joins, wrong CRS, missing variables); workflow unclear or non-reproducible.
Geospatial Analysis | 30 | Spatial methods (LMSA, EHSA) correctly chosen, implemented, and justified; results accurately interpreted; assumptions/limitations acknowledged. | Methods applied with minor errors or incomplete justification; interpretations generally correct but lack depth. | Methods misapplied or inappropriate; results incorrectly interpreted; assumptions/limitations ignored.
Geovisualisation & Communication | 20 | Maps/visuals clear, professional, and effective (symbology, legends, colors appropriate); commentary concise, insightful, and business-friendly (≤200 words). | Maps mostly correct but lacking polish (unclear legends, distracting design); commentary descriptive not analytical. | Maps poorly designed or misleading; missing legends/labels; commentary absent, too brief, or irrelevant.
Reproducibility | 15 | Quarto runs end-to-end without modification; code modular and organized; explanations clear; packages, data, dependencies documented; workflow replicable. | Quarto runs with minor edits/warnings; explanations uneven; workflow somewhat fragmented; limited documentation. | Document does not run or produces errors; code without explanations; workflow disorganized; reproducibility absent.
Bonus: Advanced Work (per subtask) | 15 | Advanced Method/Validation – correct, well-explained. Interactive Delivery – working Shiny/Quarto with README. Extra Data Fusion – adds relevant dataset with interpretation. | Method applied but shallow. Interactive works but incomplete instructions. Extra data added but weak interpretation. | Not attempted or incorrect. Little/no added insight.

Submission Instructions

  • The write-up of the take-home exercise must be in Quarto HTML document format. You are required to publish the write-up on Netlify.
  • Zip the take-home exercise folder and upload it onto eLearn. If the size of the zip file exceeds the capacity of eLearn, you can upload it to SMU OneDrive and provide the download link on eLearn.

Due Date

23rd October 2025 (Thursday), 11.59pm (midnight).

Learning from seniors

You are advised to review these sample submissions prepared by your seniors.

Learning from IS415

  • KHANT MIN NAING: Very well done in all five grading criteria, especially the ability to provide a comprehensive overview of the analysis methods used and a discussion of the analysis results.
  • MATTHEW HO YIWEN: Able to provide a clear and comprehensive discussion of the geospatial data wrangling process and to communicate the analysis results by using appropriate geovisualisation and data visualisation methods.

Q & A

Please submit your questions or queries related to this take-home exercise on Piazza.
