Friday, November 27, 2009

Working with big data

One of the main goals of the project is the ability to process large datasets (mainly raster and vector layers, but also database tables). In pursuing this goal, we have run into several problems in the base libraries (mainly GeoTools, Java Image I/O Ext and Sextante) when accessing big raster data. Part of our work has therefore focused on reinforcing these libraries.

Some of these problems have already been solved:
- The ASCII GRID driver of Java Image I/O Ext was not able to read large files
- The Sextante-GeoTools bindings loaded the whole raster layer into memory before processing it
- The Sextante-GeoTools bindings loaded the whole dbf table into memory before processing it
- The Sextante-GeoTools bindings created the whole dbf table in memory before writing it to disk
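
As a rough illustration of what processing a raster without loading it whole can look like, here is a minimal sketch that streams an ESRI ASCII grid line by line and keeps only running statistics in memory. It is not the code of the Image I/O Ext driver or of the Sextante-GeoTools bindings; the class name `AsciiGridStreamSketch` and its methods are hypothetical, and the default NODATA value of -9999 is assumed from the usual ASCII grid convention.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.StringTokenizer;

/** Hypothetical sketch: scans an ESRI ASCII grid while holding one text line at a time. */
final class AsciiGridStreamSketch {

    /** Running statistics over all cells that are not NODATA. */
    static final class Stats {
        long count;
        double sum;
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;

        double mean() {
            return count == 0 ? Double.NaN : sum / count;
        }
    }

    static Stats scan(Reader source) throws IOException {
        BufferedReader in = new BufferedReader(source);
        int ncols = -1;
        int nrows = -1;
        double noData = -9999; // assumed default when the header omits NODATA_value
        long cellsSeen = 0;
        Stats stats = new Stats();

        String line;
        while ((line = in.readLine()) != null) {
            line = line.trim();
            if (line.isEmpty()) {
                continue;
            }
            if (Character.isLetter(line.charAt(0))) {
                // Header line such as "ncols 3"; other keys (xllcorner, cellsize...) are skipped.
                String[] kv = line.split("\\s+");
                if (kv.length < 2) {
                    throw new IOException("Malformed header line: " + line);
                }
                String key = kv[0].toLowerCase();
                if (key.equals("ncols")) {
                    ncols = Integer.parseInt(kv[1]);
                } else if (key.equals("nrows")) {
                    nrows = Integer.parseInt(kv[1]);
                } else if (key.equals("nodata_value")) {
                    noData = Double.parseDouble(kv[1]);
                }
                continue;
            }
            // Data line: fold each value into the statistics, then let the line be discarded.
            StringTokenizer tokens = new StringTokenizer(line);
            while (tokens.hasMoreTokens()) {
                double value = Double.parseDouble(tokens.nextToken());
                cellsSeen++;
                if (value == noData) {
                    continue;
                }
                stats.count++;
                stats.sum += value;
                stats.min = Math.min(stats.min, value);
                stats.max = Math.max(stats.max, value);
            }
        }
        if (ncols < 0 || nrows < 0 || cellsSeen != (long) ncols * nrows) {
            throw new IOException("Cell count does not match the ncols/nrows header");
        }
        return stats;
    }
}
```

Passing a `FileReader` for a grid file to `AsciiGridStreamSketch.scan` would keep memory use roughly constant regardless of the raster size, since only the current line and four numbers are retained.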

However, some tasks are still pending, all of them due to GeoTools and Image I/O limitations:
- Raster layers must be created completely in memory before being written to disk
- No BigTIFF support (TIFF files are limited to 4 GB)
- No binary grid support
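
To give a feeling for when the first two limitations start to matter, the following sketch computes the uncompressed size of a raster and compares it with the 4 GB ceiling of classic TIFF, which comes from its 32-bit file offsets. The class `RasterSizeSketch` and its methods are hypothetical helpers written for this illustration, and the estimate ignores headers, metadata and compression.

```java
/** Hypothetical helper: back-of-the-envelope raster size estimates. */
final class RasterSizeSketch {

    /** Classic TIFF addresses data with 32-bit offsets, so it tops out around 2^32 bytes. */
    static final long CLASSIC_TIFF_LIMIT = 1L << 32;

    /** Bytes needed to hold every sample of the raster uncompressed. */
    static long uncompressedBytes(long width, long height, int bands, int bytesPerSample) {
        long cells = Math.multiplyExact(width, height);
        return Math.multiplyExact(cells, (long) bands * bytesPerSample);
    }

    /** True when the pixel data alone would not fit in a classic TIFF file. */
    static boolean exceedsClassicTiff(long width, long height, int bands, int bytesPerSample) {
        return uncompressedBytes(width, height, bands, bytesPerSample) > CLASSIC_TIFF_LIMIT;
    }
}
```

For example, a single-band 40000 x 40000 raster of 4-byte floats needs 6,400,000,000 bytes, which is above the classic TIFF limit and is also the amount of memory required if the layer has to be built completely in memory before writing.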

Currently, these remaining tasks are considered low priority, but we hope to solve them in future phases of the project. Collaborations are welcome!
