Ecology Data Ecosystem case study
Objective? E.g. “to build a national data integration and sharing capability that enables scientists, ecologists, and all groups with an interest in New Zealand's ecosystem to collaborate, coordinate, and build intellectual and social capital that helps New Zealanders support the environment”. Or “To use data to support the effort to make NZ Predator Free by 2050”. In-depth exploration of the value proposition, (sub-)interest groups, challenges, concerns, existing approaches to data integration and sharing.
Reverse brief for the New Zealand bio heritage and predator free communities.
The aim of this reverse brief is to outline the feasibility, desirability and keys to success for a data Commons-based solution to meet the data sharing needs of New Zealand's community of common interest in ecology, bio-heritage and Predator Free New Zealand. Note that this is not a formal statement of user requirements. The aim here is to introduce the value proposition and core ideas, and test these with the community of interest in order to refine the reverse brief, and for that community to decide its level of interest in pursuing a data Commons-based solution to their data sharing needs.
Outline: To do this I will:
Propose a definition of the community of interest and their needs.
Review the existing state of data sharing within this community, and alternative solutions to a data commons
Introduce a data Commons-based approach, what this is, and the keys to its success
Review, at a high level, the advantages and challenges of adopting a data Commons-based approach to data sharing and integration for this community
- Define the core interaction: building a network effect, developing and maintaining the user community
- Review three potential starting services
- Explore how the data Commons meets the community's need for motivation, orientation, and efficiency.
- Describe the benefits and opportunities of providing an open market for ecology data. This is done by providing a number of potential services and value adds spinning off the Commons.
- Consider funding, kickstarting and scaling. How does an ecology data Commons fund itself?
- Consider constituency needs: the interests and design briefs of particular constituencies, for example scientists' publishing needs, farmers' needs, et cetera.
- Engage with community curation questions: how does the community of interest govern itself? How open, how closed? Can anyone produce data? How is data quality regulated?
The New Zealand ecosystem community of interest.
The New Zealand Biological Heritage Science Challenge and the Next Foundation have co-funded the data Commons work because they want to investigate whether improved data sharing within this community can improve the coordination, mobilisation and effectiveness of efforts to sustain and improve New Zealand's ecosystem.
There are two principal areas of endeavour. At the superset level is the interest of the scientific and conservation community in monitoring the health of New Zealand's bio heritage. Within this is a community of interest in Predator Free New Zealand. We think that these interests overlap enough that we will treat both as a single case study. This is because much of the data will be of interest to both parties if it can be shared, and is likely to be collected and used for a range of ecology, science and conservation practice.
The producers and consumers of bio heritage and predator free data will include:
- Scientists, conservationists, philanthropists, the Department of Conservation, the Ministry for the Environment, citizen volunteers and NGO groups with a professional interest in ecology and pest eradication.
- Recreational users of the natural environment such as tourists, trampers, hunters, watersports enthusiasts.
- The primary sector, as it interacts with ecology, bio heritage, pest eradication, and farming and agriculture interests.
The community of interest may overlap with biosecurity and border security interests to detect and stop invasive pests.
The particular focus of this reverse brief is to describe the needs of the professional groups who are interested in bio heritage and Predator Free New Zealand (the first grouping).
The value proposition of improved data sharing and integration for the Bio Heritage Science Challenge and the Next Foundation
Ideally, improved data sharing and the ability to integrate diverse sources of bio heritage and pest eradication data will provide this community of interest with:
- Enhanced visibility, leading to improved insight for decision-making and strategic targeting of resources
- Improved ability to coordinate and collaborate at the local level, reducing the cost of operations and leading to efficient targeting of action
- The ability to recruit, motivate, enable and retain a wide and diverse range of participants around a shared objective (particularly for Predator Free New Zealand)
- New sources of investment
- Improved capacity to learn what works
Existing data sharing and integration maturity level and opportunities
[can someone do a brief roundup of current practice]
What is a Data Commons-based approach to data sharing and integration?
This reverse brief proposes that data sharing and integration can best occur through what we are referring to as a ‘data commons’. This approach is best described as an information market; a semi-open market environment within which participants can engage in mutually beneficial data sharing and integration transactions.
The term ‘market’ is used here in a broad sense; it refers to an exchange of value, which may take the form of information, insight or trust. The market environment that we propose is based on a participant-controlled platform business model. It uses protocol and platform-based technology to facilitate engagements across a network of data producers and consumers.
Some examples of this sort of market-based technology are Amazon, Uber, Bitcoin and GitHub: platforms that allow a network of interests to engage and transact to transfer and generate value. It is intended that the primary interest served by this facilitation is the common good; value created by the users of this data commons is redistributed back to the community of interest whence the raw material, the data, is sourced. This model differs from commercial, profit-driven models such as Amazon and Uber. It is not extractive but generative, in the sense that it is designed for the benefit of the community that owns and administers the platform.
One key thing to note about the data commons model is that it is not a point solution, nor does it create value in and of itself. It is a facilitator for the exchange of value within a community of users, and the value derived will be directly proportionate to the level of user engagement. The more people use and trade through the data commons, the more valuable it becomes to other users. This value is derived via a network effect.
Platform-based business models use technology to connect people, organisations and resources in an interactive ecosystem where value can be created and exchanged. A platform is a business built on enabling interactions between external entities. The platform provides an infrastructure and sets governance conditions for these exchanges, and serves the purpose of value creation through effective facilitation. This is different to the traditional data-integration point solution, which pushes value produced by a single business out to the customer via a centralised, hierarchical distribution model. These pipeline-oriented business models generally rely on inefficient gatekeepers to identify, select and market value to consumers.
The advantage of a platform-based approach to data integration over these traditional models is that openness and inclusivity support an open market for innovation to meet constituents’ needs.
The core engagement
A community of exchange is comprised of producers and consumers: although the two roles are not mutually exclusive, both are required for a functioning market. The core roles in the ecological data market are sensors, analysts and consumers of insights.
Sensors might be human or machine; remote sensors, cameras, hunters, trappers, farmers, citizens, sample collectors, station managers, NGOs and pest eradication groups all collect data about the environment whether it is their stated intent or an unconscious byproduct of their activity.
Analysts add value to the data by examining it and drawing conclusions that are useful for decision-makers. Progress against agreed metrics can be tracked over time, to see whether efforts to achieve a certain end are having any effect. Sophisticated analysis involving multiple controls can even attempt to attribute progress to a particular factor using integrated data.
There are consumers of insight at all levels and across the entire breadth of the interest spectrum, from bait station managers who need to know which bait stations to check, to community groups monitoring their area, to Predator Free New Zealand’s monitoring of the New Zealand-wide situation. Funders investing in biodiversity and pest eradication need insight into their progress, scientists seeking data for research and publication need access to quality data, and school and community groups who want to get involved will engage best when information is available.
To develop a successful ecology data market, we must define the core engagement and have a strategy in place to leverage the network effect of the data market for greatest impact.
Starting up an Ecology Data Market
The first core objective in the development process is the identification of the core interactions of producers and consumers, because the value released by the data market model is directly proportional to the number of producers and consumers who engage. Maximising value will require a strategy to drive engagement; the classic chicken-and-egg startup conundrum is that engagement creates value, but there needs to be some value in being amongst the first to engage.
Therefore, there need to be some core engagements which are high-value even during the start-up phase, i.e. a ‘pull’ strategy.
Discussion with the Bioheritage and Predator Free communities has identified three core data-intensive interactions that will provide both good data for the Commons and value for the participants, and are therefore good places to start.
Engagement service one: Station Management.
Station Management as a core activity is a good place to start drawing participants to the data commons, as this activity has high operational overheads which can be reduced for an immediate benefit over and above data integration.
One core activity of both the Bioheritage and Predator Free communities is the operational management of field hardware. These communities operate remote sensors and traps, known as ‘stations’, which perform a variety of functions and yield copious amounts of useful data that could be leveraged to improve management and effectiveness. Examples of these stations include:
- Bait stations, which need to be monitored and checked, emptied when an animal has been caught, and the bait replenished. Higher turnaround means more predators eliminated at a faster rate.
- Remote cameras and other sensors, which need to be checked, have their batteries replaced and data downloaded.
- Seed collection netting, which needs to be checked and samples collected.
This sensing system with all its associated operation management and coordination activity is valuable, but also expensive. The business of operationalising the management of stations can be streamlined by improvements in data capture and use. Standardising the data acquisition process will make the integration of data simpler and faster. There will be high value in developing a systematic and low-cost approach to data collection, with common standards and data capture methods.
We can automate the data collection from stations, using geospatial information capture, wireless transmission of sensor and trap status, QR codes that capture common features of the station, and cell phone location and photographic data to develop tools to support collaboration. Integrating data captured from station management into a simple app that generates SMS alerts when stations need attention would allow station management to become a production line business. This would enable collaboration between multiple actors who could participate more effectively with real-time information about stations in their area, and simultaneously create an entry point to the data commons since data captured via the app would conform to the data commons standards of interoperability.
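The alerting idea can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `StationStatus` fields, the 30-day visit interval and the 20% battery threshold are hypothetical, and the function only builds the alert text; actually sending an SMS would depend on whichever gateway the community chose.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class StationStatus:
    """Hypothetical status record reported by, or recorded for, one station."""
    station_id: str
    kind: str             # e.g. "bait", "camera", "seed_net"
    triggered: bool       # trap sprung, or sample waiting to be collected
    battery_pct: float    # use 100.0 for stations without a battery
    last_visit: datetime


def needs_attention(status, now, max_gap=timedelta(days=30), min_battery=20.0):
    """Return the reasons (possibly none) why a station should be visited."""
    reasons = []
    if status.triggered:
        reasons.append("triggered")
    if status.battery_pct < min_battery:
        reasons.append("low battery")
    if now - status.last_visit > max_gap:
        reasons.append("visit overdue")
    return reasons


def build_alerts(stations, now):
    """Build one short alert message per station that needs a visit."""
    alerts = []
    for status in stations:
        reasons = needs_attention(status, now)
        if reasons:
            alerts.append(
                f"Station {status.station_id} ({status.kind}): " + ", ".join(reasons)
            )
    return alerts
```

A station manager's phone would then only receive messages for stations with at least one reason listed, which is what would reduce unnecessary visits.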
As a pull strategy?
[Platform Revolution lists eight strategies for launching a platform (page 89).]
Providing a streamlined station management user experience is an example of a single-sided strategy (number five, page 95).
A business that lowers costs and improves productivity by generating SMS alerts to improve station management could provide the nexus for initial data capture and integration with the commons. It would provide genuine operational value immediately; an immediate value for the sponsors who have lower costs due to automated surveillance reducing the number of visits to stations. This user pathway can be further bolstered by the use of QR codes, and integration with an app on a cell phone to capture geospatial location data and other relevant information to streamline data capture. This would likely be a huge efficiency gain for participants as well as a data commons capture opportunity.
The efficiency gains may be a large pull factor for people doing this kind of work. They may also be a pull factor for volunteer groups, who would have an easier way to integrate their activity with the wider community.
Engagement service two: "Dob-a-bunny" or "Ferret Cloud"
A second exchange opportunity exists for the use of remote camera sensing to detect predators. Remote cameras produce a lot of photographs that need to be analysed to be of value. By streamlining the process for uploading data from cameras and making the photographs available to a wider community, we could crowd-source the scoring of the photographs by making them easily available to schools and interested volunteers.
An increase in value yield from the data thanks to a wider pool of analysts and monitors may create an incentive for more sensors to be deployed, and accelerate the scaling of the detection program. There is scope to gamify crowd-participation by opening the data for innovators to create apps and games for participants to get involved with predator detection.
A coherent, rationalised structure for photographic data would allow big data analytics to be deployed on top of community-based crowdsourcing. The crowd-sourced scores can be used to build learning models that do some initial triage and, at scale, automate the process of filtering.
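One way to picture how crowd-sourced scoring might feed automated triage is a simple agreement rule. The sketch below is an illustration only, not a proposed design: the function names, the minimum of three votes and the 70% agreement threshold are all hypothetical. Photographs whose volunteer scores agree are treated as settled (and could later serve as training examples for a learning model), while the rest are queued for further review.

```python
from collections import Counter


def consensus_label(votes, min_votes=3, min_agreement=0.7):
    """Return the agreed label for one photo, or None if there is no consensus."""
    if len(votes) < min_votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    if count / len(votes) >= min_agreement:
        return label
    return None


def triage(photo_votes):
    """Split photos into those with an agreed label and those needing review.

    photo_votes maps a photo identifier to the list of labels volunteers gave it.
    """
    settled = {}
    needs_review = []
    for photo_id, votes in photo_votes.items():
        label = consensus_label(votes)
        if label is None:
            needs_review.append(photo_id)
        else:
            settled[photo_id] = label
    return settled, needs_review
```

The thresholds trade speed against accuracy: a higher agreement requirement sends more photographs back for additional scoring.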
This market interaction provides a core element of the predator free program, enables data capture and the use of data at scale, and improves the ability of volunteers to participate. It may also create spillover effects for data scientists to develop new kinds of visual recognition products. Further opportunities include dashboards, league tables and other motivation incentives around accuracy and quality, including prizes for schools. This is an efficient way to harness both intrinsic and extrinsic motivation for citizen contributions to Predator Free New Zealand.
As a pull strategy? In this case, Predator Free New Zealand may need to act as a first producer to kick-start the platform, both by providing the right kind of streamlined workflow to upload the photographic data, and by creating motivation-based incentives to engage the community. Given the public value this program aims to deliver, light-touch forms of motivation such as recognition and an awards system are likely to be effective. We should also investigate whether viral strategies are good for this purpose, for example inducing people to recruit other volunteers through recognition of their efforts in onboarding participants.
Engagement service three: "Eco-health-kit"
A third aspect of core business focused on the Bio Heritage space is the use of genomic analysis of soil, air and water samples to measure organism presence and biodiversity. The presence of organisms in samples is detected using marker genes that operate as ‘barcodes’ for specific organisms. Rather than collecting and storing the whole genome for the organism, specific marker genes are detected, which is a data-efficient way of counting the presence of an organism. A string of such barcodes from a single sample is used to analyse the biodiversity in the environment from which it was collected; water, air or soil.
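As a rough illustration of how a string of barcodes could be turned into a biodiversity figure, the sketch below computes species richness and the Shannon diversity index for one sample. The brief does not say which measure would be used; the Shannon index is offered only as one common ecological choice, and treating each barcode detection as a single observation of an organism is a simplifying assumption. The function names and barcode strings are hypothetical.

```python
import math
from collections import Counter


def richness(barcodes):
    """Number of distinct organisms (distinct barcodes) detected in a sample."""
    return len(set(barcodes))


def shannon_index(barcodes):
    """Shannon diversity H = -sum(p_i * ln(p_i)) over the barcodes in a sample.

    p_i is the share of detections belonging to barcode i. H is 0 when only
    one organism is present and grows as detections are spread more evenly
    across more organisms.
    """
    counts = Counter(barcodes)
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values())


# Hypothetical samples: each string stands in for one detected marker gene.
stream_sample = ["mayfly", "mayfly", "caddisfly", "snail", "midge"]
drain_sample = ["midge", "midge", "midge", "midge", "midge"]
```

Comparing `shannon_index(stream_sample)` with `shannon_index(drain_sample)` gives a higher value for the first sample, reflecting its more varied mix of organisms.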
The genomic sampling of our ecosystem provides useful data about the effects of land use and pest eradication, and is a generalised health barometer for the New Zealand ecosystem. It is of interest to scientists and conservationists, and is likely to be of increasing interest to wider New Zealand, as evidenced by the upswing in concerns about intensification of land use and its environmental effects.
Predator Free New Zealand has suggested that the sampling regime could be expanded to include genomic material from mid-sized creatures such as insects, using data from the Forest Net seed capture program.
Again: streamlining the collection, tagging and referral of samples using QR codes and GIS, establishing metadata standards around initial analysis and genomic extraction, and developing the barcodes for the samples for release to a community of users provides a compelling value proposition.
There will likely be interest from specific groups in the community of scientists and conservationists in the presence or absence of specific organisms, and in marker organisms for the level of nitrates in waterways. Some may be interested in comparing biodiversity across New Zealand, and the effects of land use activity on biodiversity at both local and national levels over time, to create a comprehensive weather map of bio heritage markers.
If this kind of capability can be streamlined through the use of cell phone technology and standardised sampling methods, then there is also the potential for volunteers, the primary sector, regional councils and others to contribute their sampling to the commons. In much the same way that individuals already seek to understand their personal genomic heritage using cheek scrapings, there is value in collecting bio heritage data from the landscape for individuals and communities, and value to scientists in the facilitation of exchange that allows them access to samples. There is both a localised and a common good to be derived.
As a Pull Strategy?
This project is also likely to require a seeding strategy, or potentially a marquee strategy: provide incentives for the user community in the form of special benefits and some supporting infrastructure. For example, providing a testing facility for people who wish to contribute their samples to the commons.
Are there other core business starting points that will engage in data capture and contribution to the commons?
There are limitless opportunities to innovate and take advantage of new technologies in the Predator Free and Bio Heritage space. With a coherent strategy for collecting, integrating and sharing data in place to tease out the most value from data assets, the return on investment in tech solutions increases.
New technologies that enable precision agriculture also have the potential to be put to work in service of ecological surveillance. A Nelson company, DroneMate Agriculture, has a drone product in development that uses infrared technology to assess the health of plants. It’s conceivable that drone technology could also be used for automated surveillance of forest canopies, identifying areas of defoliation, flooding and erosion, and using autoanalysis to detect changes over time. Sharing the data collected by drones via the data commons, and integrating it with datasets generated by other activities within the community, would enrich the evidence base for decision-making at a national and regional level, and provide further opportunities for scientists and innovators to develop and target pest eradication strategies.
Systems for the precision aerial sowing of baits for possum and rabbit control are under constant refinement and redevelopment; New Zealand innovators are developing new technologies to improve strip and cluster sowing, drastically reducing the amount of bait required, which not only increases cost effectiveness but also reduces collateral damage. GPS and mapping data are key to the effectiveness of these projects, and support the development of precision techniques that make sowing rates independent of helicopter speed, allow bait distribution to be more consistent, and protect ‘exclusion zones’ from being baited. The users and developers of these technologies are a potential consumer market for an ecological data commons as well as contributors, as data detailing the distribution of baits could be integrated back into the commons.
Similarly, species-specific control tools are under intensive development, and this process requires quality data. The availability of standardised ecological data through the data commons would reduce the cost of developing and trialling new methods of pest eradication, and where it is necessary for researchers to collect their own specific data, its addition to the data commons would further enrich the resource for all.
Excess value from the commons
The proposed data commons would help the scientific and conservation community meet its need for coordination, collaboration and stakeholder engagement with regard to improving New Zealand’s ecosystem. If a data commons can successfully start up by attracting and retaining users and producers, there are additional spillover effects that also serve the community of interest’s needs.
The following opportunities afforded by an ecology data commons are based upon the assumption that this ecological data were the only data in the commons, and that this data could be integrated geospatially.
- Monitoring, coordination and planning:
Widespread, standardized collection of biodiversity, seed dispersal, pest biomass and pest type data will enable the generation of a comprehensive ecological weather map of New Zealand. This rich, integrated data will provide decision-makers at all levels of the system with quality evidence upon which to base decisions.
Another spillover effect in this area is the potential for better awareness of the human resources that are available to the predator free campaign. We can learn how many volunteers are participating, and facilitate their input into the growing knowledge base. We could learn who is managing which bait stations, and which techniques they are finding the most effective; who is interested in participating in predator identification from photographic data, and where there are areas of limited visibility both in terms of effort and information. With this insight, it will be easier to identify opportunities to improve the community’s management of both human and data resources.
- Efficient management of sensing and automated activity: The network also provides the opportunity to use outbound calling and SMS to coordinate localized management of sensing resources and stations.
- Stoat Lotto (Motivation): The ability to monitor progress across a variety of metrics using integrated data from multiple sources enables gamification strategies to drive participation. Well-aligned incentives to invest in traps that enable participation in reward schemes such as catching the certified millionth rat, league tables for rabbit spotters and photo scorers, and ratings for water quality improvement can all be used as potential motivators to drive participation. These strategies need to be well-considered, however, to avoid the creation of perverse incentives.
- Entrepreneurship and innovation on existing data: A fully open ecological data commons would create opportunities for innovation, as creative minds devise new ways to score photographs, and find new uses for biodiversity samples that may also have commercial benefits for the primary sector and farming. There’s no way to tell in advance what these opportunities might be, but the scope is broad and the potential vast.
- Attract other data users/providers: An open market repository for pest and biodiversity data will likely attract other kinds of data, as network effects generate more incentives for data generators to participate. For example, scientists collecting specialised data would benefit from being able to analyse it in the context of the biomarkers in their area compared with other areas, and so have an incentive to integrate their own data.
- Raising investment: Because there will be real-time feedback on the effects of investment by philanthropists and by individual citizens who might choose to sponsor a trap or otherwise participate, investment is likely to increase where results can be seen. Ecology or predator free bonds might raise capital by sharing risk with investors, in the way that social bonds do. Since results can be measured using integrated data, all sorts of innovative investment schemes can be imagined.
Funding and scaling
An ecology data commons for New Zealand will create excess value; the next question is how to extract some of that value to co-fund the platform.
Excess value can be converted into funding through:
- access to value creation, i.e. users pay to get value back
- access to markets, i.e. users can onsell their services
- access to tools, i.e. users get support for their work
There are shades of a commitment dilemma apparent in the value proposition for the ecology commons. Since most of the users either can’t or won’t want to pay to engage in the commons directly, the community will need to find ways to co-fund a shared resource. Most of the funding sources are likely to be government, philanthropic or research interests.
There are several funding models to consider.
Transaction fee: Trading data through the commons offers a massive cost reduction to users, so to fund this efficiency a small fee could be applied to transactions whilst still yielding users a net gain. Many volunteers and operators have little to no cash available for information purchase, however, and the fee may serve as a disincentive to participate if levied upon all users.
Charge for access: Since the bulk of the value generated by the ecology data commons falls into the domain of public goods, and there is little direct personal value for most operators, charging a flat fee for access may also serve to dampen the network effect and reduce value by shrinking the participant pool. Apps that are generated off the back of commons data need to be free, or very low-cost, to attract users.
Enhanced access model: A graduated scheme that places a greater cost burden on the users deriving the most value could be a more equitable cost sharing model. Large organisations such as universities, ministries and philanthropic groups could purchase a degree of access to more granular data, with more superficial access available at lesser cost to smaller operators.
Community design brief: Trust and control
In considering the design of the commons, the concerns of all constituents must be considered in terms of the New Zealand Data Futures Forum’s principles of value, inclusion, trust and control.
Each principle’s primacy is different for each constituent. For scientists, who need to publish their research, control is a vital element of the incentive to participate. For volunteers, schools and community organisations, inclusion and ease of access to data resources is key. The Department of Conservation stands to derive great value from the resource, but this needs to be balanced with the trust of data suppliers at all levels, from volunteers to farmers and hunters, if the value is to be realised.
These concerns lead into a discussion of the community’s curation, and the management of the shared platform.
Curation of the community and shared platform
Identity management? Open? Closed? Licensed use?