What is in a name? Moving grapevine varieties to code

15 January 2021
Graphic image showing the attributes of the cabernet sauvignon grape including colour, region and name

Grape variety.
If you’re in the wine industry, or even just love a glass, then you know how much those two words mean. The authoritative list of grapevine varieties in Australia is held by Wine Australia — and it currently stands at 223 varieties, not including synonyms. That varietal richness is part of the greatness of the Australian wine industry, but it’s also a lot of information to incorporate simply and meaningfully into, say, the interface of an app.

The #Collabriculture project’s first series of workshops reaffirmed the significance of grapevine variety and similar large data sets, both to the industry and to technologists interested in the field. Central to #Collabriculture’s mission is facilitating our community’s efforts to make this kind of data standardised, usable, and accessible.

Knowing that grape variety is vital information for the viticulturist, the Platfarm app encourages the user to attach this information to blocks in a vineyard as they are named and categorised. Getting the picker right when building a tool for growers is a significant user experience (UX) issue: grape variety is information you want people to add accurately and not skip over, but a long list of options can overwhelm users. The potential for overload or avoidance must be considered when managing, storing, and displaying these kinds of data sets.

Platfarm addresses this issue in part by working from an abbreviated list of common grape varieties. This approach is working well for us because we’re an app built by a grower for growers, giving us a lot of insider knowledge on the varieties being grown not just in Australia but in the specific regions where users are concentrated. But it’s a list that we add new varieties to regularly, because we’re responsive to users and we can’t anticipate what every grower has planted.

Technically, adding a new variety to a list in an app isn’t hard, but it is work. We need to create a ticket, and then our developer needs to make the addition, add it to the build, and push out the update in the next release — and then we’ll let the original grower know that the variety is now in the app for them to choose. Depending on the length of a sprint you could be looking at two weeks for a new item to get added to a list. It’s not insignificant.

So last week, as part of the #Collabriculture project, we created a GitHub repository listing all of the Australian grapevine varieties.

Image showing a file in the github repository

If you’re not familiar with GitHub, it’s a widely used development platform; essentially a code repository where developers, researchers, and companies can store, build, and maintain software.

GitHub is sometimes likened to a library, but at its best we think of it as more of a community garden, where people with a range of skills and experience bring in plants or even the seeds of ideas, tend them, weed them, prune them, and grow more, building and sharing their knowledge and resources along the way. Anyone willing to do the work, maintain the logs, and respect the project is welcome; take what you need and contribute what you can. Private repositories (walled gardens) are available, but those who purchase privacy forgo community contributions.

Since it was founded in 2008, GitHub has become a fundamental tool for developers and people working on a range of projects. People can and do collaborate by adding to, updating, or changing code projects, leaving comments and discussion along the way. Code from publicly accessible areas can be used in other projects. Putting a project in a repository like GitHub makes it accessible: it becomes searchable and usable.

Creating open, shared, and accessible standards is one of the fundamental goals of #Collabriculture, so that the community can all work from a common base. In practice this means having information in places where people can find and use it, like central industry bodies or the #Collabriculture website.

But why put something like grapevine varieties in GitHub, a third location, when Wine Australia has the authoritative list and #Collabriculture has a website?

It’s important to understand that this data is richer, and can be both more complex and more efficient to use, in a coded format than as plain text or an ordinary list or table.

For example, to list five categories of information about one variety:

Cabernet Sauvignon (cabernetsauvignon), AKA Cabernet, is a red variety most notably grown in the Barossa, Coonawarra, Langhorne Creek, Margaret River, and McLaren Vale regions.

That’s grape variety name, database ID, synonym, colour, and major regions.

Presenting this information as plain text is inefficient for programmers at the point of use. Presenting it in a list or table also makes meaning dependent on relative spatial placement: pull the word out of the table and the other information is lost.
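To make the contrast concrete, here is a minimal sketch of the Cabernet Sauvignon sentence above as a coded record. The class name, field names, and JSON layout are our own assumptions for illustration, not the format used in the #Collabriculture repository. The point is that every value carries its own label, so it keeps its meaning wherever it is moved.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class GrapeVariety:
    """One grapevine variety. Field names are illustrative assumptions."""

    id: str                                            # database ID
    name: str                                          # display name
    colour: str                                        # e.g. "red" or "white"
    synonyms: List[str] = field(default_factory=list)  # alternative names
    regions: List[str] = field(default_factory=list)   # major growing regions


# The five categories from the plain-text example, now individually labelled.
cabernet = GrapeVariety(
    id="cabernetsauvignon",
    name="Cabernet Sauvignon",
    colour="red",
    synonyms=["Cabernet"],
    regions=[
        "Barossa",
        "Coonawarra",
        "Langhorne Creek",
        "Margaret River",
        "McLaren Vale",
    ],
)

# Serialise to JSON, a format most apps can consume directly.
record_json = json.dumps(asdict(cabernet), indent=2)
print(record_json)
```

Unlike a table cell, the word "red" here is always attached to the key "colour", whichever program reads it.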

Storing and managing grapevine varieties in a coded format means, ideally, that Platfarm or anyone else could pull this dataset as required in an immediately usable and information rich form. It would always be up-to-date — if a new variety was planted in a region of Australia it would be added locally but anyone connected to the repository would benefit — and categorisations within it could be expanded or modified over time according to the needs of all users.
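As a rough sketch of what "pulling the dataset" could look like inside an app, the example below parses a small JSON payload, builds the labels for a variety picker, and resolves a synonym to its variety. The payload layout and the function names (load_varieties, find_variety) are hypothetical, and the three sample records are illustrative only. In practice the text would be fetched from the shared repository instead of being embedded in the code.

```python
import json

# Illustrative sample only; a real app would fetch this text from the repository.
SAMPLE_PAYLOAD = """
[
  {"id": "cabernetsauvignon", "name": "Cabernet Sauvignon",
   "colour": "red", "synonyms": ["Cabernet"]},
  {"id": "shiraz", "name": "Shiraz", "colour": "red", "synonyms": ["Syrah"]},
  {"id": "chardonnay", "name": "Chardonnay", "colour": "white", "synonyms": []}
]
"""


def load_varieties(payload):
    """Parse JSON text and index the varieties by their database ID."""
    return {item["id"]: item for item in json.loads(payload)}


def find_variety(varieties, query):
    """Return the variety whose name or synonym matches query, else None."""
    wanted = query.strip().lower()
    for variety in varieties.values():
        names = [variety["name"], *variety["synonyms"]]
        if wanted in (n.lower() for n in names):
            return variety
    return None


varieties = load_varieties(SAMPLE_PAYLOAD)

# Alphabetical labels for a picker in the app's interface.
picker_labels = sorted(v["name"] for v in varieties.values())
print(picker_labels)

# A grower who types a synonym still lands on the right variety.
match = find_variety(varieties, "syrah")
print(match["id"])
```

With this arrangement, adding a variety means adding a record to the shared data, not shipping a new release of the app.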

That’s the goal, and it’s very realistic. But it is a project.

At some point someone (probably me) is going to have to do the work of categorising the 223 varieties into something more manageable.
Perhaps breaking it down into grape colour first? Or taking the approach of the List of Australian wine grape varieties on Wikipedia and breaking it down by region?
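Once the varieties are stored as coded records, we may not have to choose between those two approaches: the same data can be grouped either way on demand. The sketch below assumes each record has "colour" and "regions" fields (our own hypothetical layout) and uses a three-variety sample purely for illustration.

```python
from collections import defaultdict

# Illustrative sample only, not the full list of 223 varieties.
VARIETIES = [
    {
        "name": "Cabernet Sauvignon",
        "colour": "red",
        "regions": ["Barossa", "Coonawarra", "Langhorne Creek",
                    "Margaret River", "McLaren Vale"],
    },
    {"name": "Shiraz", "colour": "red", "regions": ["Barossa", "McLaren Vale"]},
    {"name": "Chardonnay", "colour": "white", "regions": ["Margaret River"]},
]


def group_by_colour(varieties):
    """Map each colour to the names of the varieties of that colour."""
    groups = defaultdict(list)
    for variety in varieties:
        groups[variety["colour"]].append(variety["name"])
    return dict(groups)


def group_by_region(varieties):
    """Map each region to the names of the varieties listed for it."""
    groups = defaultdict(list)
    for variety in varieties:
        for region in variety["regions"]:
            groups[region].append(variety["name"])
    return dict(groups)


by_colour = group_by_colour(VARIETIES)
by_region = group_by_region(VARIETIES)

print(by_colour["red"])
print(by_region["Margaret River"])
```

A picker could then show a short colour or region filter first and only list the varieties that match.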

Regardless of which approach we choose, we believe in sharing it with the sector to save some other team the pain of doing this themselves down the track. Because let’s face it, we are all time- and resource-poor, we all benefit from getting the fundamentals right, and the fewer barriers there are to building accurate and meaningful tools, the better off the entire industry will be.

Watch this space, because it will be going into the #Collabriculture GitHub repo.
Is this a project that you or a member of your team would like to collaborate on?

And if you have existing resources to contribute, please share.