North Sea Checkpoint Data
Whilst carrying out the various challenges, the North Sea Checkpoint (NSCP) project has become aware that there is no 'data broker' facility to assist in the valuation of datasets. Neither data portals nor the independent literature give widespread information on the value of the data for particular uses. In most cases it is incumbent on the user to download the data and then make an assessment of its value. The evaluation of data sourced during NSCP challenges is structured against six 'value criteria':
- Contribution - what impact the data have on solving the problem. Fundamentally, the data must contain the required parameter or phenomenon to be of value. This is clearer for single variables, but has more meaning when applied to groups of data such as total suspended matter, hydrodynamic conditions, rainfall, etc. For example, total suspended matter alone may contribute less to solving a problem than a combination of water quality and hydrodynamic parameters.
- Location - where and when the measurements were taken. The spatial and temporal distributions of the data are critical, as most data are required for a particular site and/or time frame.
- Commercial - what the data cost. Any data will have to be selected within the constraints of the data cost with respect to the allocated budget. For end customers, data costs need to be set against the benefit realised. Previous studies have shown that organisations do not object to paying for data, but pricing needs to be clear so they can budget for it. Commercial terms are also a factor, as they may dictate what can be done with the data.
- Attributes - fitness for purpose. This covers a number of factors about the data, such as accuracy, precision, and spatial and temporal resolution. It also embraces quality-control parameters such as metadata and the traceability of processing applied to the data.
- Delivery - can the data be supplied in time? Delivery is important in time-critical applications. This is particularly the case in emergency operations, such as monitoring oil spills, and in applications where the data have a short shelf life, e.g. weather forecasts. This may also encompass continuity: can the data be supplied on an ongoing basis?
- Usability - how easy is it to use the data? This covers such factors as the ease of visual presentation or the ease of extraction to provide input to a numerical model or software package. Clearly, the demand will be greater for data that can be readily consumed by the customer.
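To make the six criteria concrete, the sketch below shows one possible way to record a valuation as a pass/fail score per criterion with a reason. This is purely illustrative: the class and field names are assumptions, not the project's actual schema.

```python
from dataclasses import dataclass

# The six NSCP value criteria described above.
CRITERIA = ["contribution", "location", "commercial",
            "attributes", "delivery", "usability"]

@dataclass
class CriterionScore:
    criterion: str   # one of CRITERIA
    passed: bool
    reason: str      # e.g. "spatial coverage does not include the study area"

@dataclass
class Valuation:
    """One user's valuation of one dataset against one challenge."""
    dataset_id: str
    challenge_id: str
    scores: list     # list of CriterionScore, one per criterion

    def suitable(self) -> bool:
        # Illustrative rule: a dataset is suitable for the challenge
        # only if every value criterion passes.
        return all(s.passed for s in self.scores)
```

A single failing criterion (say, delivery) would then mark the dataset as unsuitable for that challenge while preserving the reason for future users to read.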
To ensure that this valuation is captured for future use, we are developing a data screening tool based loosely on that provided by 'TripAdvisor' for the travel industry and, as such, we have named it the 'Data Advisor'. TripAdvisor provides a valuation of hotels against various attributes, such as suitability for business trips, romantic destinations, family friendliness, etc. Crucially, these valuations are provided by the users – it is their perspective, given at a certain time and with their own criteria in mind. We hope that the Data Advisor, or the concept behind it, could provide a data brokerage service which supports the oft-stated aim of 'collect once, use multiple times'. Our challenge valuations will be captured within the Data Advisor tool and published in a searchable form to serve as use case examples.
The interface, currently in development, will allow the user to do the following:
- Search for datasets associated with a particular challenge or challenges at a particular level of valuation. For example, 'Find all datasets considered suitable for challenge X'.
- Search for the challenges against which a dataset has been valued, for example, 'For dataset X, find all challenges where the dataset was considered'.
- View the scoring metrics for each dataset. This allows the user to view the value criteria the dataset passed or failed and the associated reason.
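The three searches above can be sketched as simple queries over stored valuation records. The record layout and function names here are hypothetical, chosen only to show the shape of each query, not the Data Advisor's actual interface.

```python
# Illustrative store: one record per (dataset, challenge) valuation.
# "scores" maps each value criterion to a (passed, reason) pair;
# all identifiers below are made up for the example.
valuations = [
    {"dataset": "D1", "challenge": "X", "suitable": True,
     "scores": {"location": (True, "covers study area")}},
    {"dataset": "D1", "challenge": "Y", "suitable": False,
     "scores": {"delivery": (False, "no operational feed")}},
    {"dataset": "D2", "challenge": "X", "suitable": False,
     "scores": {"commercial": (False, "licence forbids redistribution")}},
]

def datasets_for_challenge(records, challenge, suitable_only=True):
    """'Find all datasets considered suitable for challenge X'."""
    return sorted({r["dataset"] for r in records
                   if r["challenge"] == challenge
                   and (r["suitable"] or not suitable_only)})

def challenges_for_dataset(records, dataset):
    """'For dataset X, find all challenges where the dataset was considered'."""
    return sorted({r["challenge"] for r in records
                   if r["dataset"] == dataset})

def scoring_metrics(records, dataset, challenge):
    """View pass/fail and the associated reason per value criterion."""
    for r in records:
        if r["dataset"] == dataset and r["challenge"] == challenge:
            return [(crit, passed, reason)
                    for crit, (passed, reason) in r["scores"].items()]
    return []
```

Because every valuation keeps both the pass/fail outcome and the reason, the same records serve all three searches without any extra bookkeeping.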
This concept captures the user's perspective, a viewpoint often missed when datasets are offered by suppliers. The valuations are the perspectives of individual users and accumulate into a history of snapshots of user experience. As such, this idea represents an ideal opportunity to seed a critical mass of information that can grow into an invaluable, sustainable tool for all potential future data users.