Friday, November 27, 2009

Working with big data

One of the main goals of the project is the ability to process large datasets (mainly raster and vector layers, but also database tables). While pursuing this goal, we found several problems in the base libraries (mainly GeoTools, Java Image I/O Ext and Sextante) when accessing big raster data. Part of our work has therefore focused on reinforcing these libraries.

Some of these problems have already been solved:
- The ASCII GRID driver of Java Image I/O Ext was not able to read large files
- Sextante-GeoTools bindings loaded the whole raster layer in memory before processing it
- Sextante-GeoTools bindings loaded the whole dbf table in memory before processing it
- Sextante-GeoTools bindings created the whole dbf table in memory before writing it to disk
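
The idea shared by the fixes above is to process data as a stream instead of loading it whole. As a rough illustration only (this is not the actual Sextante-GeoTools or Image I/O Ext code; the class and method names are hypothetical and it uses nothing but the Java standard library), the sketch below summarizes an ASCII grid one line at a time, so memory use stays constant regardless of file size. It assumes the usual ASCII grid layout: header lines starting with a keyword such as ncols or NODATA_value, followed by whitespace-separated cell values.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.StringTokenizer;

/** Hypothetical example class: streams an ASCII grid instead of loading it. */
public class StreamingGridStats {

    /** Running summary; these few fields are all that is kept in memory. */
    public static final class Summary {
        public long count;
        public double sum;
        public double min = Double.POSITIVE_INFINITY;
        public double max = Double.NEGATIVE_INFINITY;

        public double mean() {
            return count == 0 ? Double.NaN : sum / count;
        }
    }

    /** Reads the grid line by line and accumulates statistics on the fly. */
    public static Summary summarize(Reader source) throws IOException {
        Summary s = new Summary();
        // NaN never equals any value, so nothing is skipped if no NODATA header exists.
        double noData = Double.NaN;
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                line = line.trim();
                if (line.isEmpty()) {
                    continue;
                }
                // Assumption: header lines start with a letter, data lines do not.
                if (Character.isLetter(line.charAt(0))) {
                    String[] kv = line.split("\\s+");
                    if (kv.length == 2 && kv[0].equalsIgnoreCase("NODATA_value")) {
                        noData = Double.parseDouble(kv[1]);
                    }
                    continue;
                }
                StringTokenizer cells = new StringTokenizer(line);
                while (cells.hasMoreTokens()) {
                    double v = Double.parseDouble(cells.nextToken());
                    if (v == noData) {
                        continue; // skip cells marked as missing
                    }
                    s.count++;
                    s.sum += v;
                    s.min = Math.min(s.min, v);
                    s.max = Math.max(s.max, v);
                }
            }
        }
        return s;
    }
}
```

Calling StreamingGridStats.summarize(new FileReader(path)) on a large grid would hold only the current line and the running summary in memory. The real bindings deal with tiles and table records rather than text lines, but the principle is the same.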

However, some tasks are still pending, all of them due to GeoTools and Image I/O limitations:
- Raster layers must be created completely in memory before being written to disk
- No BigTiff support (TIFF files are limited to 4 GB)
- No binary grid support

Currently, these remaining tasks are considered low priority, but we hope to solve them in future phases of the project. Collaborations are welcome!

Tuesday, November 24, 2009


Most of the data that ETC-LUSI processes or creates in its daily work are TIFF files in the EPSG:3035 projection (Lambert Azimuthal Equal Area projection using the ETRS89 datum).

However, we've discovered that ArcGIS (at least in version 9.3) is not able to properly encode EPSG:3035 using standard GeoTiff tags, so it uses an external auxiliary file (in a proprietary format) to store the spatial reference information.

GeoTools is not able to read these proprietary files, and therefore it refuses to read these (false Geo-)TIFF files. Fortunately, the library can still be convinced to read them by passing the DEFAULT_COORDINATE_REFERENCE_SYSTEM hint when the GeoTiff reader is created.

For the moment, we can live with this workaround, but we would be really pleased to see ArcGIS generate correct, standard GeoTIFF files for the EPSG:3035 projection, as it is the reference projection at the European Environment Agency.