I sat down with Mariah Kenney, one of our PSC Data Curator and Metadata Librarians (DCMLs), to talk about her work with the Brain Image Library (BIL) and other projects.
CC: Mariah, welcome. When we bumped into each other in the lunchroom today we talked a little about your title, Data Curator Metadata Librarian. How would you define it?
MK: I don’t have a background in librarianship, but as a data curator, I feel like I can dig deep into the meaning of the data we are collecting and how we represent it in ways that are useful for people. Curating, gathering information, making it accessible. When I think of data curators, I think of data geeks, people who are obsessed with data and learning about what it is and what you can do with it. The librarian part is more about cataloging, describing, and organizing.
CC: What do you feel is your role here at PSC? What’s your wheelhouse?
MK: Before the DCMLs were here, there were people trying to fill this data curator role. That’s why we were hired — there was a need for it that overlaps a lot with software engineers or people who are designing databases. How do I design this in a way that makes sense for the data? Everything that we do is trying to take all of these computational resources and information systems and put them in the context of data, data users, and the scientific community. But my main wheelhouse is BIL.
CC: Putting everything in context for the community, that’s really useful. Let’s talk about some of the things you’ve been working on, for example, BIL workshops?
MK: I aim to do three or four a year, and each one is a little different. Some are devoted just to data submission. Some cover both data submission and data exploration, because BIL is not only a repository but also an analysis ecosystem. How do we help people understand that they have access to both of those things and how to use them?
I’ve gone to a lot of conferences. We just had a booth at the Brain Initiative conference to talk to users and data submitters who are in the Brain Initiative consortium, to let them know what we’re doing and all of the new developments that are going on. We’re constantly changing and evolving and making more tools available. Later this year, we’ll be at the Society for Neuroscience Conference.
I also spend a fair amount of time mentoring students. I’m working with a summer internship student right now and also mentored a student intern last fall. As a data curator, it’s fun for me to work with students who are not in the same field. Most are computer science majors. Data curation is starting to branch off, becoming something that everyone needs to start doing. If you’re working with data, you’re going to have to curate your own data at some point. I also gave a guest lecture on data curation for the Data Analytics for Science master’s program, because those students also need to know how to interpret metadata.
I’m also working on a project with the Data Curation Network, a larger network of data curators, to create a primer of educational materials for other data curators. We have a cohort of four data curators who, like me, work at specialized or general repositories, universities, or nonprofit organizations. Together we’re creating these resources with a specific focus on how to curate imaging data.
CC: How does your work here compare with other places that you have worked before?
MK: I had a whole other life and career trajectory before I was at PSC, as a biologist. I worked in biology labs and wet labs, as an experimental evolutionary biologist. I still bring so much of what I learned in the lab to my work as a data curator, because I can help our data submitters with the mindset of someone who was in their shoes before, who understands what they’re going through.
I loved working in a lab. I think I learned so much doing that. But it’s nice to be on the other side, too. And before I was at PSC, I worked at a nonprofit called Dryad as a data curator. Dryad is very different because it’s a generalist repository. While it was exciting to see different types of data every single day, I enjoy working on projects like BIL because it really gives me the opportunity to build much more rapport with the people who are submitting and to create a collaborative community around what we’re trying to do.
CC: I’m sure the data submitters that you work with are glad to have someone who’s been in their shoes before.
MK: It’s important to realize that as a scientist, you have a very specific set of pressures and deadlines that go along with your job. So it helps that I was there before and that I understand, at least a little bit, what their priorities are. You can’t understand everything, of course.
CC: We’ve talked about what you find rewarding about your job… What are some of the things you find more challenging?
MK: For me, I feel that I can really lean on my fellow DCMLs, Brendan and Jackie, if I’m completely out of my wheelhouse. But because so much of what we do is working with data from some of the world’s leading scientists, and they are creating the techniques as we’re dealing with the data, we have to learn about things that only a few papers have been written about. I need to figure out how to learn about all these completely new techniques and how I’m going to deal with that data. I think that’s the hardest part for me, just constantly keeping up. But that’s fun as well.
CC: Mariah, what is your vision for BIL going forward, the big picture view?
MK: I don’t think people realize how cool the Brain Image Library truly is. I don’t know of many other repositories that give you direct access to computational resources like that. We’re gathering so much data in a way that makes it easy for people to reuse right now. You don’t even have to download it. BIL is going to be a wonderful resource for the next generation of scientists, because as we continue down this road of gathering more and more information, we need to figure out how to interpret it. I think that the next generation of scientists will make neuroscience more accessible to the global scientific community.
I’m excited to see what research comes out of it, because with the amount of information we are currently storing at BIL, there are so many novel scientific questions that we could answer. Not just within the Brain Initiative consortium, but across the global scientific community. That’s where I’m at: data for all. Also, the data is really only useful when people can use it. That’s the greatest test of my job: You’re looking at the data and the metadata. Can you do what you need to do?
Mariah Kenney, DCML