Artsy Engineering Radio

21: Building Artsy's Android App

David Sheldrick & Steve Hicks. Edited by Aja Simpson. Season 1 Episode 21

David Sheldrick and Steve Hicks talk about Artsy's journey to build an Android app. This episode is filled with tips and perspective on breaking down a large project into actionable steps.

Steve Hicks:

In this episode of Artsy Engineering Radio, I talk to David Sheldrick about how to approach a really big project. In our case, it was the Artsy Android app. But David shared some lessons about breaking problems down and finding risks that are applicable to any big project. Hope you like the show. Hey, friends, welcome to another episode of Artsy Engineering Radio. I'm your host today, Steve, and I am here with my friend David. David, do you want to say hi, and introduce yourself?

David Sheldrick:

Hi, yeah, I'm David. I've been at Artsy for about three years now, working mostly on the mobile app.

Steve Hicks:

And doing a pretty bang up job on the mobile app for what it's worth. And honestly, David, that's what I want to talk to you the most about today is your work on the mobile app, especially the fact that, well, Artsy has had an iOS app for quite some time built with React Native, which seems like, you know, if we've got a React Native mobile app, that we would also have an Android app to go along with it. But we didn't. It took us a long time to get it. So that's really what I want to talk about with you. So maybe you could take a step back a little bit and kind of give us a history of our React Native project.

David Sheldrick:

Sure. So Artsy actually started with just a pure iOS mobile app, written in Objective C. And React Native was added about four years into the project in early 2016. And it was added in such a way that we could write individual parts of the app in React Native, but it would mostly just be the UI and a little bit of state management that would be then used from the main Objective C app. So it was a brownfield React Native integration. And over time, we added more and more screens in React Native, and converted a few of the existing iOS screens to React Native as well. And eventually, we got to a place where we thought, we want an Android app, you know, to reach Android users, so what is it gonna take for us to get there? Which I think is what we're gonna be talking about.

Steve Hicks:

Yeah, definitely. Were you involved in that introduction of React Native to Eigen? Or did that all pretty much happen before you were around?

David Sheldrick:

Yeah, that was before my time, by a couple of years.

Steve Hicks:

Okay. And then I imagine that you did get your hands on some, like, especially at the beginning, you were in there doing some new React Native stuff, even with a whole lot of iOS or native Objective C code to go along with it?

David Sheldrick:

Yeah, my first project working on the mobile app was converting the artwork page, which is, I think our most viewed page. So the artwork page or the artist page, but probably the artwork page. And, yeah, that was a fairly large project, because there's a lot of rendering logic based on what data that we have about a particular artwork, whether it's in an edition set, and there's complicated interactions, like deep zoom on artwork images, so that people can see the fine detail of an artwork. And yeah, that took us a few months. So that was my first exposure to the Artsy app.

Steve Hicks:

Sure. We talked a little bit about edition sets on a previous episode of this podcast, as it's like a thing that we always kind of forget exists until we get really close to launching something. And then it's like, oh, we forgot edition sets.

David Sheldrick:

It took me a good couple of weeks to understand even what they are.

Steve Hicks:

Yeah, exactly. So at some point, I think we decided that we were going to try to do more things in React Native versus natively in Objective C. And I'm curious if you have any context on why that happened, like what was the driving force to start moving towards doing everything in React Native?

David Sheldrick:

I think the driving force was productivity. We had a lot of web engineers who were completely unable to contribute to an Objective C app, and the experiment that happened in 2016 of adding React Native showed that we could take these people with web skills and get them productive on a React Native app pretty quickly, shipping features. And at some point, I think around the time I joined, there was an explicit discussion about which language we want to write future code in. And at the time, there was a bit of Swift, a bit of Objective C, a bit of React Native, and we decided that pretty much all feature work should be done in React Native, Swift should be not worked on - only maintained - and Objective C should be used for any kind of native features that we need. For example, we added an augmented reality feature to let people see artworks in their own homes. And that was done in Objective C, you really couldn't do that in React Native at the time.

Steve Hicks:

Right. Is that something that we would be able to do now in React Native, do you think?

David Sheldrick:

I think that is still a fairly experimental area. I've definitely seen people doing computer vision stuff with TensorFlow.js, or maybe using native bindings. So yeah, I think it would definitely be more doable today than it was back in 2018.

Steve Hicks:

Cool. That's good news. So we have this iOS app, and we start moving towards React Native, as a JavaScript developer myself, I really appreciate Artsy's decision to do everything in React Native, instead of Swift, because it means that I can actually contribute to the project, which is cool. Now that we're headed down that path. I know since I started at Artsy, like I had been hearing us talk about the idea of an Android app, but not much more than talk. However, in the last...I don't know, when did you really start thinking about this in the last year or two? You drove a lot of work to actually go ahead and build an Android app in our React Native codebase. And honestly, it's impressive, like the amount of work that you had to do to set us up for that. So I'm just kind of curious if you wanted to just talk about how you approached this project, like it's huge.

David Sheldrick:

Even before I joined there was kind of an understanding that one day we will have an Android app and that was what React Native was about. And then about 18 months ago, someone said, "Oh, you should write a technical plan for what it would take to make Android happen," because I talked about it in, you know, meetings and knowledge shares about what we would need to do. And I converted all those thoughts into a technical plan. I think you've talked about technical plans on this podcast as well.

Steve Hicks:

Yep. We have a little.

David Sheldrick:

Yeah, a written document that describes what you want to do to achieve a particular technical goal, in this case, making an Android app happen. And it was quite a high level plan, you know, because it's a huge project. There's way too much to go into detail on. But we had outlined a couple of different strategies that it would be possible to take to bootstrap an Android app, with different trade-offs. One of them was just, you know, start writing a new app from scratch in React Native. Maybe we could use the existing React Native components that we had in TypeScript, but probably not because they had some dependencies on existing application infrastructure that was written in Objective C. That would have probably gotten us an Android app quicker than what we did. But it would have fragmented the code base. So you'd have to write everything twice. And it would be in a separate Git repository as well, so when you're implementing a new feature, you have to touch multiple repositories. And we decided that it would not be effective long term to do that. The approach that we decided that we wanted to pursue was to gradually refactor the existing parts of our app code bases to make them look more like a normal React Native app, the kind of app that you get when you run the project initialization script. So yeah, that tech plan was about 18 months ago, and there was a lot of discussion about it. But we still didn't really have exec buy-in - by exec I mean the CEO, basically. There was no mandate from leadership to say, "Yes, please start building the Android app now" at that point. That came about six months later. I'm not sure what the delay was. But at that point, we then kind of converted the tech plan into more fine grained steps. But even between those two points, because we had this idea of a strategy to pursue, we could actually find parts of that strategy which were even beneficial to building just the iOS app on its own. Like, even if we never made an Android app, the kind of infrastructural refactorings that we did during that time really helped improve the developer experience of the app for people who just build on iOS.

Steve Hicks:

Yeah, that makes sense. That feels like one of those things, you know, like accessibility, where just making these changes, while it has a specific audience that it might be targeted towards, it just makes it better for everybody.

David Sheldrick:

Yeah, I think one of the most impactful things that we did was to combine the repositories that were involved in making the Artsy app before. We had two - there was the Objective C app, and then we had all our React Native stuff in a separate repository, where all the TypeScript code lived, plus a little bit of Objective C code, and then that was vendored as a CocoaPod, which the iOS app consumed like a normal CocoaPod dependency. So that meant that when doing a lot of development, you would have to update two repositories at once. It was kind of difficult to work on both at the same time, like to link them together when you were doing development. So we combined the two repos and simplified the development workflow a ton.

Steve Hicks:

Yeah, that was really helpful when we did that. I think we did a similar thing on the web side at one point, also. We had too many services, like for the development workflow, it was just too much to try to link all these things together. To just make it one project was really nice for developer workflow.

David Sheldrick:

Yep. And then after that, we eventually got a mandate to start explicitly planning and working on the Android app. And we broke down the remaining chunks of the technical plan into smaller bits. That basically amounted to infrastructural changes, and converting Objective C user flows into TypeScript user flows. We have a few major user flows in Objective C like onboarding, and live auctions. And we decided that we wanted to start working on the onboarding stuff first, or at least, that's what product said that they wanted to happen. I think product, you know, they think of user needs ahead of everything else, which of course they should. But realistically, we had all these infrastructural changes that needed to happen before that even made sense to think about. We also broke down the infrastructural changes. So things like navigation, environment configurations, push notifications, all those kinds of native integrations that are difficult to do. At least difficult to refactor, because they touch the entire app.

Steve Hicks:

Yeah, I think when I think about PRs pushed up by you, I think of those kinds of PRs that are like refactoring in a cross cutting way that touches basically everything in the app. And it's always like, those are very courageous types of pull requests to open. And that kind of work, we couldn't do it without that kind of work. And I'm curious, I guess, when you approach those cross cutting things, and you've got this huge refactoring that needs to happen, is there any way that you can break that work down? Or how do you handle that kind of PR? Is this the thing that you bang your head on for weeks? Or is it a thing that you come back to every now and then or you're able to break those things down into small pieces?

David Sheldrick:

Some of them could be broken down into small pieces, some of them really couldn't, especially the navigation. So that was definitely a situation where I had to bang my head against the wall for a few weeks. And, you know, the reason is that we don't want to block people doing product work by having a broken app for a few weeks, while I do everything in incremental chunks. And also, we don't want to have these long lived PRs that people have to review over and over again. So yeah, I broke them down as much as I could. But yeah, like you said, there were a few big PRs that touched basically the entire app, and I got very good at QAing every screen in the app.

Steve Hicks:

How did you get very good at QAing? Did you have a script or any automation to help you with that? Or was it you just kind of started to memorize these workflows and work through it yourself?

David Sheldrick:

Yeah, I had a list of all the screens in the app, in the code base itself, because of how we register screens. And what I would do is just go through every screen, check that they all render properly, check that the interactions work. So it was all manual, there was no scripting. But once you learn how to navigate the particular parts of the app, it becomes pretty quick.
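To illustrate how a screen registry can double as a manual QA checklist, here is a minimal TypeScript sketch. The registry shape, the screen names, and the `howToReach` hints are all hypothetical and are not taken from Artsy's codebase; the point is only that if every screen is registered in one place, the list of things to check can be generated from it.

```typescript
// Hypothetical screen registry: names and fields are illustrative assumptions,
// not Artsy's real registration code.
interface ScreenEntry {
  // A hint for the person doing manual QA on how to reach this screen.
  howToReach: string;
}

const screenRegistry: Record<string, ScreenEntry> = {
  Artwork: { howToReach: "Tap any artwork from the home feed" },
  Artist: { howToReach: "Tap the artist name on an artwork" },
  Onboarding: { howToReach: "Log out, then relaunch the app" },
};

// Turn the registry into a sorted, plain-text checklist, one line per screen.
function buildQaChecklist(registry: Record<string, ScreenEntry>): string[] {
  return Object.keys(registry)
    .sort()
    .map((name) => `[ ] ${name}: ${registry[name].howToReach}`);
}

console.log(buildQaChecklist(screenRegistry).join("\n"));
```

Because the checklist is derived from the registry, a newly registered screen shows up in the next QA pass without anyone having to remember to add it.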

Steve Hicks:

Right. Almost second nature, a lot of good muscle memory in that I imagine. I was wondering if maybe you could talk maybe at a higher level about breaking this problem down? You know, we talked about how you were able to identify these cross cutting concerns like redoing the nav and etc. But I'm curious if you have any more insight on what it took to break that work down?

David Sheldrick:

That's a really good question. Well, it definitely took a deep knowledge about how things work, and also I had to do research by going into the codebase and looking at how things fit together and how they need to fit together at the end of the refactoring. And identify risks involved in those things. Like, is this going to break this other part of the app if we do this? Is there anything we can do to find out if there are unknowns? So a lot of the time we would do little spikes during this planning work. But yeah, mostly, it's just code spelunking, and app spelunking. So actually using the app to figure out how things work. A lot of the time we were finding bugs during that process as well.

Steve Hicks:

Yeah. But I imagine you know, it's a little bit of a nice feeling, knowing that nobody's actually using this Android app yet. And so those kinds of bugs that are affecting Android only are easy to maybe prioritize and try to get the big infrastructural stuff done first. So one of the technologies that I know that we are big on here at Artsy is TypeScript. And I'm curious if that was a factor in any of this breaking the problem down and rewriting things? How did that affect how you thought about what needed to be done?

David Sheldrick:

That's another really good question. So a lot of the time, we were refactoring Objective C code into TypeScript. And I've been using TypeScript for a long time. So it was fun for me to be able to take someone else's code, which is, you know, always worse than your own code, and then make something that I felt cleaned things up a lot. Sure. For example, we have routing code, the way that we navigate around the app is by using URLs. So that was compatible with our website. And we have routing infrastructure right there.

Steve Hicks:

Right, they match. The routes match between the app and the website.

David Sheldrick:

Yep. And we have routing infrastructure to decide which screens to show for particular routes. And that had all been done in a fairly kind of ad hoc way over time with the Objective C code. And that happens, that's how applications evolve over time. But then we have this nexus point where we're converting it all to TypeScript. And that gave us a lot of scope to clean it up, make it really nice, make it easy to extend, make it just kind of resilient to change or resilient to bugs. Yeah, so that's an example of where TypeScript helped.
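The URL-to-screen routing described here can be sketched roughly as follows. This is a minimal TypeScript illustration under assumptions: the route patterns, screen names, and the `matchRoute` function are hypothetical and are not Artsy's actual routing code. It shows how a single typed route table can decide which screen to show for a path that is shared with the website.

```typescript
// Hypothetical route table; patterns and screen names are assumptions for illustration.
interface Route {
  pattern: string; // e.g. "/artist/:artistID"
  screen: string;
}

interface RouteMatch {
  screen: string;
  params: Record<string, string>;
}

const routes: Route[] = [
  { pattern: "/artwork/:artworkID", screen: "Artwork" },
  { pattern: "/artist/:artistID", screen: "Artist" },
];

// Return the first route whose pattern matches the path, along with the
// values captured by ":name" segments, or null if nothing matches.
function matchRoute(path: string, table: Route[] = routes): RouteMatch | null {
  // Drop any query string and split into non-empty segments.
  const segments = path.split("?")[0].split("/").filter(Boolean);
  for (const route of table) {
    const patternSegments = route.pattern.split("/").filter(Boolean);
    if (patternSegments.length !== segments.length) continue;
    const params: Record<string, string> = {};
    let matched = true;
    for (let i = 0; i < patternSegments.length; i++) {
      const expected = patternSegments[i];
      if (expected.startsWith(":")) {
        params[expected.slice(1)] = segments[i];
      } else if (expected !== segments[i]) {
        matched = false;
        break;
      }
    }
    if (matched) return { screen: route.screen, params };
  }
  return null;
}

console.log(matchRoute("/artist/some-artist"));
```

Keeping every route in one table means adding a screen is a one-line change, and an unknown URL produces an explicit `null` that the caller has to handle rather than a silent fallthrough.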

Steve Hicks:

Yeah. For what it's worth, when I do have a type problem, you are the person that I immediately think about. So just speaking to your expertise with TypeScript there. You mentioned the word resilient there in talking about some of the changes that you made. Did it feel good as you were going through this process to just feel like you were able to make improvements in that way? Going through this, it sounds like you weren't just converting things over to work for Android, it was also making improvements so that, you know, our Android work and our iOS work would all benefit from those changes.

David Sheldrick:

Yeah, exactly. There's a few other places where that happened, for example, state management. Again, all the state management in iOS was just in various different places using different approaches. Also, in TypeScript, we had a few little bits of state management that we managed to clean up as well, and consolidate everything into one single source of truth. It's really nice infrastructure for doing testing. And yeah, we just had a lot of opportunities like that to really simplify the developer experience, make the app more robust.

Steve Hicks:

Cool, all of us at Artsy are definitely going to be thankful for those changes. So thank you. I imagine, as you were going through this, that there were a lot of risks that you would run into. And I am wondering how you...you mentioned earlier that just having a lot of knowledge about this app was maybe helpful. But are there any other things that were helpful for finding risks throughout this process?

David Sheldrick:

So yeah, one of the things that we did to reduce risk was to cut scope. There's a few features of the iOS app which are implemented in Objective C, and which we don't necessarily need for an MVP Android launch, for example, the augmented reality thing I mentioned earlier, and we just decided to not make that for Android. And that was a nice way to cut down risk. Because, you know, we've never done augmented reality on Android before, that probably would have been a huge project. Another thing was, we didn't want to launch an Android app that felt like an iOS app. You know, there are UI patterns that only make sense on iOS. And there are UI patterns that only make sense on Android. And we wanted to make sure that we weren't getting those mixed up. So we did some research on how Android UI works, to make sure that we weren't going to be doing anything that felt bad for Android users.

Steve Hicks:

Did you incorporate a designer in that also? Were there some resources that we were able to take advantage of that they were able to help us with? Maybe they had a sense of what would be good patterns to use and what wouldn't?

David Sheldrick:

Yes, so when I say we did that research, what I meant was our team, which included a designer, Sam, who has left the company, I really miss her. She did some great work finding navigation patterns that would make sense on Android and what we needed to change.

Steve Hicks:

Cool. What went wrong in this process? Because I imagine lots of things did.

David Sheldrick:

Yeah, a few things went wrong. With big infrastructure changes comes a big potential for breaking stuff. One thing that we broke a couple of times was deep link behavior. So when you receive a deep link from a push notification, or when you receive a push notification and you tap it, it opens the app via a deep link, which takes you to a particular screen. A couple of times, because we were making all these changes to the navigation infrastructure, it actually broke the deep linking behavior. We obviously tested during our QA sessions, but we still broke it in ways that we weren't quite catching. We had a really big problem one time when we were refactoring the environment variables and environment configuration code, where we have a staging environment and a production environment. And then obviously, the users use the production environment, and we use the staging environment for development and testing. And we had it set up by accident so that when we deployed the version of the app, it forced all of our users who had downloaded that version of the app into the staging environment. So they were seeing test data.

Steve Hicks:

Oh nooooo

David Sheldrick:

Yes, it was a huge problem. And the only way that we could fix it was by releasing a new version of the app that fixed the problem, pushed users back into production. And yeah, that was frustrating.
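One way to guard against this class of mistake is to make it structurally impossible for a release build to resolve to staging. The TypeScript sketch below is an assumption about how such a guard could look, not a description of what Artsy actually shipped; the `BuildInfo` fields and the `resolveEnvironment` function are hypothetical names.

```typescript
type Environment = "staging" | "production";

// Hypothetical build metadata; both fields are illustrative assumptions.
interface BuildInfo {
  isReleaseBuild: boolean; // would be set by the build tooling
  requestedEnv?: Environment; // optional override chosen by a developer
}

// Release builds always resolve to production, regardless of any override.
// Only non-release builds may honor the override, and they default to staging.
function resolveEnvironment(build: BuildInfo): Environment {
  if (build.isReleaseBuild) {
    return "production";
  }
  return build.requestedEnv ?? "staging";
}

console.log(resolveEnvironment({ isReleaseBuild: true, requestedEnv: "staging" }));
```

The design choice is that the release check comes first, so a stale or accidental override cannot leak test data to real users.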

Steve Hicks:

Did that happen around the time that we started thinking a little bit more about kill switches for old versions of the app?

David Sheldrick:

Yeah.

Steve Hicks:

I feel like there was maybe some work where I think it was around that time where we said, You know what, we really need to be better about being able to completely kill the app for people who have bad versions.

David Sheldrick:

That was exactly the thing that caused those discussions.
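A kill switch for old app versions usually comes down to comparing the installed version against a minimum supported version that the server can change at any time. The following TypeScript sketch shows that comparison under assumptions: the function names are hypothetical, versions are assumed to be dot-separated integers, and how the minimum version is fetched is left out entirely.

```typescript
// Parse a dot-separated version like "6.1.2" into numeric parts.
// Non-numeric parts are treated as 0 in this simplified sketch.
function parseVersion(version: string): number[] {
  return version.split(".").map((part) => parseInt(part, 10) || 0);
}

// Returns a negative number if a < b, positive if a > b, and 0 if equal.
// Missing parts count as 0, so "6.1" equals "6.1.0".
function compareVersions(a: string, b: string): number {
  const partsA = parseVersion(a);
  const partsB = parseVersion(b);
  const length = Math.max(partsA.length, partsB.length);
  for (let i = 0; i < length; i++) {
    const diff = (partsA[i] ?? 0) - (partsB[i] ?? 0);
    if (diff !== 0) return diff;
  }
  return 0;
}

// Hypothetical check: the app is "killed" when it is older than the
// minimum version the server currently supports.
function isKilled(installedVersion: string, minimumSupported: string): boolean {
  return compareVersions(installedVersion, minimumSupported) < 0;
}

console.log(isKilled("6.1.2", "6.1.3"));
```

Comparing numeric parts rather than raw strings matters because a plain string comparison would rank "6.10.0" below "6.9.9".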

Steve Hicks:

Okay, cool, then I'm remembering correctly. That's great. Any other things go wrong, tremendously?

David Sheldrick:

Not tremendously. We had some small UI issues, especially around safe area insets and things because when you refactor navigation infrastructure, you change the way that screens are presented. And sometimes that can impact how they render.

Steve Hicks:

Sure. Are safe area insets related to, like, the notch thing at the top? Is that what that's about?

David Sheldrick:

Yeah, so notches, and on Android, you have these little camera holes. And also, rounded corners can affect that as well.

Steve Hicks:

Sure. Of all those things that went wrong, when you look back at this project do you feel like it's reasonable to accept that these things would have gone wrong? Do you feel like more things went wrong than you expected, or fewer things went wrong than you would have expected?

David Sheldrick:

Expectation is hard to judge because I've never worked on a project like this before. I do wish that we'd done more thorough QA of some parts of the app before we deployed. Aside from the staging thing, there were no huge problems. It was mostly minor rendering stuff. And also, maybe bugs in flows that users don't use very often. Because we have a pretty thorough manual QA process.

Steve Hicks:

Yeah, definitely.

David Sheldrick:

But yeah, it could have been more thorough. And I think in the future, I would probably double down on manual QA.

Steve Hicks:

Okay, you would prefer that to any sort of - I don't even know what kind of automated QA you can do with a mobile app. I mean, I imagine since it's React Native, like we can have as many automated tests as we can get in there. Probably ways to test that on the native side, too. But do you feel like that manual QA was significantly more successful than any automation we did?

David Sheldrick:

We haven't done any test automation. I mean, we have unit tests automated, but we don't have any integration tests. You know, I'm thinking of like what people do on the web with Cypress. Or what is that framework that people used before Cypress came along? It begins with a P...

Steve Hicks:

Protractor is one of them?

David Sheldrick:

No, that's not what I'm thinking of. Anyway, yeah, we don't do kind of end to end tests that use the actual UI to trigger test cases. Because it's quite difficult to do in a non flaky way. And yeah, I think with manual QA, you get a lot more, you know, if the people have good eyes for design and user experience, then you get a lot more safety in terms of not deploying things that make users feel bad, which I think is really important in mobile apps, and everything.

Steve Hicks:

Especially mobile apps. And so that actually is something that I have learned about mobile apps in working on a team with you for a short time. As a web developer, it's really easy for me to think like if I ship something to production, I can just really quickly undo it by reshipping. But that's not exactly a thing that could happen with a mobile app. And it's something that I always forget. And then remember it as I think through problems. But if I ship something in version 6.1.2, and then fix it in 6.1.3, there's still people who are going to have 6.1.2 who never update their app or don't update their app for months or something like that. So once the code is out there, it's out there. That's a thing I always forgot.

David Sheldrick:

Yeah, some people don't update their apps for years, in fact.

Steve Hicks:

And I imagine we've seen that in our analytics.

David Sheldrick:

Yes. I don't know what the oldest version of the app currently in active use is, but it's at least two years old. Yikes.

Steve Hicks:

I didn't even know it was possible to have a device that was two years old. Just kidding. Okay, so David, I just want to say thanks for hanging out with me. And I'm seriously going to miss you. I didn't announce this earlier, but as the harbinger of doom on this podcast, I'm doing an episode here with you right as you're about to leave Artsy, which is really sad to me. So thank you for all the tremendous work that you've done here. And this Android project, I can't even describe how big it is and how important it is to us.

David Sheldrick:

Well, thanks, Steve. I'm really gonna miss Artsy. And everyone who I've worked with, on the Android project especially. It's been one of the highlights of my career for sure.

Steve Hicks:

You did a great job. We appreciate it.

David Sheldrick:

Thanks.

Steve Hicks:

Cool. Thanks for hanging out, David.

David Sheldrick:

Cheers, Steve.

Steve Hicks:

Thanks for listening. You can follow us on Twitter @artsyopensource. Keep up with our blog at artsy.github.io. This episode was produced by Aja Simpson. Thank you Eve Essex for our theme music, you can find her on all major streaming platforms. Until next time, this is Artsy Engineering Radio.