I’ve been testing two new tools from Google’s ecosystem; they could be classified as data mashup and refining tools.
I’ve tried Google Refine and Google Fusion Tables. The first one is a fantastic data cleaning tool with a client-server architecture that, to my surprise, runs in local mode. It’s a Java application, obviously, because the Mountain View boys want a multiplatform program.
Refine works with facets and filters: a facet is similar to a dimension in OLAP cubes or data warehouses. With facets it’s possible to group records around all the distinct values present in the column selected as the facet.
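To get the idea, a facet is essentially a count of records grouped by each distinct value of a column. A minimal sketch in Python (the sample data and column names are invented for illustration):

```python
from collections import Counter

# A tiny invented dataset: each record is a dict, like a row in a Refine project
records = [
    {"city": "Madrid", "type": "library"},
    {"city": "Madrid", "type": "museum"},
    {"city": "Sevilla", "type": "library"},
]

def facet(records, column):
    """Group records by every distinct value of a column, like a Refine text facet."""
    return Counter(r[column] for r in records)

print(facet(records, "city"))  # Counter({'Madrid': 2, 'Sevilla': 1})
```

Clicking a value in the facet then restricts the view to the matching records.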
Moreover, to work on a concrete dataset, the filter tool narrows down the desired records column by column, step by step: the result of each filter becomes the source for the next one, so the final result set is the same as a SQL SELECT with multiple WHERE conditions.
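The step-by-step filtering can be pictured as successive passes over the same rows, each one narrowing the previous result, which is equivalent to stacking WHERE clauses. A sketch with invented sample rows:

```python
# Invented sample rows; imagine these came from an uploaded dataset
rows = [
    {"country": "ES", "year": 2010, "value": 5},
    {"country": "ES", "year": 2011, "value": 7},
    {"country": "FR", "year": 2011, "value": 3},
]

# Filter 1: its output feeds the next filter
step1 = [r for r in rows if r["country"] == "ES"]
# Filter 2, applied to the result of filter 1
step2 = [r for r in step1 if r["year"] == 2011]

# Same result as: SELECT * FROM rows WHERE country = 'ES' AND year = 2011
print(step2)  # [{'country': 'ES', 'year': 2011, 'value': 7}]
```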
Usability for beginners is great, because it has an intuitive, approachable interface, with common transforms offered as examples right in the options menu: uppercase conversion, blanking out cells, trimming cell content…
Column splits, string joins, and other utilities are available in Refine, with a visual display and varying degrees of complexity: for complex facets and filters, the application provides GREL (and Jython) to build them.
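For illustration, the menu transforms mentioned above map to one-liner expressions applied to each cell; in GREL they would look like `value.toUppercase()`, `value.trim()`, or `value.split(",")`. A rough Python equivalent of what those expressions do:

```python
# Rough Python equivalents of common cell transforms
# (in GREL: value.toUppercase(), value.trim(), value.split(","))
def to_uppercase(value):
    return value.upper()

def trim(value):
    return value.strip()

def split_column(value, sep=","):
    return value.split(sep)

cell = "  madrid, sevilla  "
print(to_uppercase(trim(cell)))  # MADRID, SEVILLA
print(split_column(trim(cell)))  # ['madrid', ' sevilla']
```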
I think it is a very interesting tool for the first stages of transforming legacy information systems: it’s possible to work in groups, it’s usable by experts and beginners alike, and the client-side application is just a web browser. Simply clever.
The second application, Google Fusion Tables, is a web data tool for working with our data tables without any database management system. GFT lets us upload structured tables into a generous user space (I believe the quota is 100 MB, but I’m not sure) and then display them on a Google map with multiple options.
The best part of this story is the possibility to cook our data tables on the web: joins, merges, filters, aggregates and more are available, both for our own data and for data others have shared with us. GFT has a large collection of shared datasets with which we can work, applying the “fusion” operations to obtain new results and display them in a geolocated environment.
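At heart, “fusing” two tables is a join on a shared key column. A minimal sketch of an inner join (the tables and their contents are invented):

```python
# Two invented tables sharing a "city" key, as in a Fusion Tables merge
population = [
    {"city": "Madrid", "pop": 3200000},
    {"city": "Sevilla", "pop": 700000},
]
visitors = [
    {"city": "Madrid", "visits": 9000000},
]

def merge(left, right, key):
    """Inner join of two row lists on a key column."""
    index = {r[key]: r for r in right}
    return [{**l, **index[l[key]]} for l in left if l[key] in index]

print(merge(population, visitors, "city"))
# [{'city': 'Madrid', 'pop': 3200000, 'visits': 9000000}]
```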
What’s more: Google’s toponym geobase enables data geolocation through a system based on place names. So it’s possible to display legacy data without knowing its coordinates (and to include it in aggregates at higher levels of the hierarchy). If your geolocated data system is a tree-structured data system, congratulations! You can have a nice day. The way to work with a legacy geolocated system is the Refine step first, and then the final Fusion step, for a good and inexpensive result.
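Geolocating by place name instead of coordinates amounts to looking each name up in a gazetteer. A toy sketch of the idea (the gazetteer entries and column names are invented; in GFT the real lookup happens against Google’s geobase):

```python
# A toy gazetteer: place name -> (lat, lon)
GAZETTEER = {
    "Madrid": (40.4168, -3.7038),
    "Sevilla": (37.3891, -5.9845),
}

def geolocate(rows, name_column):
    """Attach coordinates to rows that only carry a place name."""
    located = []
    for r in rows:
        coords = GAZETTEER.get(r[name_column])
        if coords is not None:
            located.append({**r, "lat": coords[0], "lon": coords[1]})
    return located

data = [{"place": "Madrid", "value": 42}]
print(geolocate(data, "place"))
# [{'place': 'Madrid', 'value': 42, 'lat': 40.4168, 'lon': -3.7038}]
```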
PS: At the end of March 2011, Google released a new data tool for working in the cloud: dataset upload for the Public Data Explorer. I will try it and tell you something.