Internship

The LUE team at Utrecht University welcomes students who want to do an internship (or thesis project) related to the LUE project. In past years, students have worked on various topics, e.g.:

  • Parallelize a cost distance algorithm to determine the cost from each destination raster cell to the nearest source cell.
  • Implement a custom ONNX runtime to be able to integrate trained machine learning models in LUE models.
  • Parallelize a kinematic wave algorithm to make computing (water) flows through a (hydrological) network faster.

LUE is a large suite of software with many different components that together make up the modelling framework. This means that there are many topics a student could work on. Below, we list some examples. Contact us if you are interested in an internship (or thesis project) with us, including if you would like to work on another topic related to LUE.

Concepts

With LUE we aim to integrate concepts from agent-based modelling and field-based modelling into a single modelling framework. This is not an easy task, and quite a few challenges in this area remain to be tackled.

Conceptual work can be high-level theoretical work, but also more down-to-earth work at the software level.

  • Develop / design an algebra for agent-based modelling that is as easy to use as map algebra.
  • Add more feature types to the LUE data model, like lines, polygons, and discrete global grid systems.
  • Find a way to efficiently support networks between objects in the LUE data model. This would allow LUE to represent group / individual relationships, family trees, road networks, etc.
  • Add support for spatial indices to the LUE data model. This would allow LUE to quickly find information based on geographical location. How to store this information in a compact manner? What are the implications for doing parallel I/O on very large collections of objects?
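To make the spatial index topic a bit more concrete, here is a minimal sketch of one simple kind of index: a uniform grid that buckets point objects by cell, so that a query by location only has to inspect nearby objects. It is plain Python and does not use LUE. The class name `GridIndex`, its methods, and the cell size are hypothetical and chosen for illustration only. Compact storage and parallel I/O, the questions raised above, are not addressed here.

```python
from collections import defaultdict
from math import floor


class GridIndex:
    """Uniform-grid spatial index for 2D points (illustrative, not LUE API)."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        # Maps a (col, row) grid cell to the objects located in that cell
        self.cells = defaultdict(list)

    def _cell(self, x, y):
        # Grid cell containing the coordinate (x, y)
        return floor(x / self.cell_size), floor(y / self.cell_size)

    def insert(self, object_id, x, y):
        self.cells[self._cell(x, y)].append((object_id, x, y))

    def query(self, x_min, y_min, x_max, y_max):
        """Return the sorted IDs of objects inside the closed bounding box."""
        col_min, row_min = self._cell(x_min, y_min)
        col_max, row_max = self._cell(x_max, y_max)
        found = []
        # Only visit grid cells that overlap the bounding box
        for col in range(col_min, col_max + 1):
            for row in range(row_min, row_max + 1):
                for object_id, x, y in self.cells.get((col, row), ()):
                    if x_min <= x <= x_max and y_min <= y <= y_max:
                        found.append(object_id)
        return sorted(found)


index = GridIndex(cell_size=10.0)
index.insert("a", 1.0, 1.0)
index.insert("b", 12.0, 3.0)
index.insert("c", 55.0, 70.0)
nearby = index.query(0.0, 0.0, 15.0, 5.0)
```

A uniform grid works well when objects are spread evenly. For strongly clustered objects, a hierarchical structure such as a quadtree or R-tree is the more usual choice.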

Algorithms

Generic algorithms are the core building blocks of the LUE framework. The more algorithms, the more domain experts can use LUE to represent the systems they want to simulate.

Algorithmic work can be split into conceptual and practical work. Conceptual work includes finding a good way to divide as much as possible of the work that needs to be done at runtime into smaller pieces that can be performed in parallel. Practical work includes implementing an algorithm, analysing its performance and scalability, and optimizing it.

This is just a small subset of the algorithms that would be useful to add to LUE.

  • K-means clustering
  • Viewshed analysis
  • Resampling
  • Fast Fourier Transform to make convolution operations with large kernel windows execute in less time
  • Spatial pattern analysis
  • Flow direction network creation. Better handling of flat areas and local depressions.
  • Parallel I/O. Increase the throughput to the LUE data model (or some other data model supporting parallel I/O, like netCDF-4 CF) from LUE computations on a platform with a parallel file system.
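As an illustration of the Fast Fourier Transform item in the list above, the following sketch shows the idea behind FFT-based convolution: transform both inputs, multiply them element-wise, and transform back. For brevity it handles the one-dimensional case in plain Python; for rasters the same transform is applied along both rows and columns. The function names `fft`, `fft_convolve`, and `direct_convolve` are our own for this example and are not part of LUE.

```python
import cmath


def fft(values, inverse=False):
    """Recursive radix-2 FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    result = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        result[k] = even[k] + twiddle
        result[k + n // 2] = even[k] - twiddle
    return result


def fft_convolve(signal, kernel):
    """Full linear convolution of two real sequences via the FFT."""
    out_len = len(signal) + len(kernel) - 1
    size = 1
    while size < out_len:
        size *= 2
    # Zero-pad so that circular convolution equals linear convolution
    a = fft(list(signal) + [0.0] * (size - len(signal)))
    b = fft(list(kernel) + [0.0] * (size - len(kernel)))
    product = [x * y for x, y in zip(a, b)]
    # The inverse transform above is unnormalized, hence the division by size
    return [v.real / size for v in fft(product, inverse=True)[:out_len]]


def direct_convolve(signal, kernel):
    """Reference convolution with nested loops, used to check the FFT result."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


signal = [1.0, 2.0, 3.0, 4.0, 5.0]
kernel = [0.25, 0.5, 0.25]
fast = fft_convolve(signal, kernel)
slow = direct_convolve(signal, kernel)
```

The direct version does work proportional to the signal length times the kernel length, while the cost of the FFT version depends only on the padded size. That is why the FFT approach pays off for large kernel windows.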

Data structures

LUE contains a few data structures that can be used to capture a simulated system's state. Some useful ones to add are listed below. Working on them includes thinking of a way to distribute the information. At runtime, LUE uses one or more OS processes to perform the work. These processes are distributed over the resources within a single computer or over multiple computers in a cluster. Data structures used to represent model state must be distributed as well. Under the hood, a LUE raster is a collection of partitions distributed over the processes. A LUE algorithm spawns a collection of tasks that translate these partitions into new partitions that are part of output rasters. New data structures are expected to work in a similar way.
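The partition and task idea described above can be sketched in a few lines of plain Python. This is only an analogy, not LUE code: threads within a single process stand in for LUE's distributed processes, a raster is a list of rows, and the names `partition`, `scale_partition`, and `scale` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(raster, block_rows):
    """Split a raster (list of rows) into partitions of at most block_rows rows."""
    return [raster[i:i + block_rows] for i in range(0, len(raster), block_rows)]


def scale_partition(part, factor):
    """Task: translate one input partition into a new output partition."""
    return [[cell * factor for cell in row] for row in part]


def scale(raster, factor, block_rows=2, workers=4):
    """Spawn one task per partition and assemble the output raster."""
    partitions = partition(raster, block_rows)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(scale_partition, p, factor) for p in partitions]
        # Collect the results in partition order
        output_partitions = [f.result() for f in futures]
    return [row for part in output_partitions for row in part]


raster = [[row * 4 + col for col in range(4)] for row in range(5)]
result = scale(raster, 2)
```

Multiplying by a constant is a local operation, so each task needs only its own partition. Operations that look at neighbouring cells also need data from adjacent partitions, which is part of what makes designing new distributed data structures challenging.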

Open questions:

  • How to balance the computational load when the features are spatially clustered?
  • How to handle mobility of features, which changes this load over time?
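As a possible starting point for thinking about the first question, the sketch below assigns partitions to workers with a simple greedy heuristic: handle the partitions with the most features first, and always give the next one to the worker with the smallest load so far. This is a generic scheduling baseline and not something LUE is known to implement. The function name `assign_partitions` and the feature counts are made up for illustration.

```python
import heapq


def assign_partitions(feature_counts, nr_workers):
    """Greedily assign partitions to workers, largest partitions first.

    feature_counts[i] is the number of features in partition i.
    Returns a dict: worker -> (total load, sorted partition indices).
    """
    # Min-heap of (load, worker, partition indices); the worker ID breaks ties
    heap = [(0, worker, []) for worker in range(nr_workers)]
    heapq.heapify(heap)
    order = sorted(range(len(feature_counts)), key=lambda i: -feature_counts[i])
    for idx in order:
        # The least loaded worker receives the next partition
        load, worker, parts = heapq.heappop(heap)
        parts.append(idx)
        heapq.heappush(heap, (load + feature_counts[idx], worker, parts))
    return {worker: (load, sorted(parts)) for load, worker, parts in heap}


# Hypothetical, spatially clustered data: a few partitions hold most features
counts = [900, 50, 40, 30, 20, 10, 500, 450]
assignment = assign_partitions(counts, nr_workers=3)
loads = sorted(load for load, _ in assignment.values())
```

In this example the partition with 900 features ends up alone on one worker and still dominates the total runtime, which suggests that dense partitions would also have to be split. The assignment is static as well, so it would have to be recomputed when features move.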

GUI

In the case of large datasets, standard tools for visually analysing model state reach their limits. What is needed is a new set of tools that can use all available hardware efficiently to visualize the data. The challenge is to combine fast I/O and fast algorithms to end up with something that can be visualized. This is likely a representative subset of all the data. The requirements for a visualization tool are partly already met by the existing LUE software. Algorithms useful in the context of simulation modelling can also be used for pre-processing data for visualization, for example.

Relevant technologies may be Dear ImGui and Vulkan, both of which are already used in the lue_view application. These topics do not require knowledge about the geographical domain.

  • Visualize large rasters. Allow for fast zoom and pan.
  • Animate large rasters
  • On-the-fly computations of statistics
  • Support for exploratory data analysis. Link datasets by location in time and space.

Documentation

Good documentation (clear, complete) is very important, and it is currently lacking. If you are a master of the English language and keen on improving the current situation, then this topic is for you. LUE is the successor of the long-standing PCRaster software, which is documented extensively. Many LUE concepts, data structures, and operations are based on the PCRaster ones. This means that LUE documentation can be partly created by upgrading existing PCRaster documentation.

House style

If graphic design is your thing, then we have a challenge for you, because it is not exactly our thing. The LUE house style is currently non-existent, as you can probably tell. This means that designing a house style for the project is a greenfield project. You can have a big influence on the project's look and feel! This topic does not require knowledge about the geographical domain. The envisioned candidate has a background in graphic design.