Overcome Releasephobia - 5 Tips to Simplify DevOps



Presenters
Mike Jones, VP Global Marketing
Miguel Baltazar, Technical Services Director

The 5 Tips in a Nutshell
  • Unique Fingerprint
  • Tracing the Victim's Footsteps
  • No Access to the Crime Scene
  • Lack of Motive
  • The Graveyard Shift

Duration: 45:00 mins




Mike: Welcome to today's webinar, "Overcoming Releasephobia: Five Tips to Simplify DevOps." Today's session is the third in a series of webinars on IT productivity. Our first webinar in the series featured John Rymer of Forrester Research, who shared his insights into a new class of application development environment that Forrester is calling the "new productivity platforms." These platforms are focused on dramatically speeding up delivery of custom apps and helping IT become more productive. We followed that webinar with one focused on agile in the enterprise, where we outlined nine key rules for success. We invited Dave Thomas, an early agile evangelist and the founding director of the Agile Alliance, to share his insights. You can listen to both of these recorded webinars and get the complete Forrester report from OutSystems' Event page at OutSystems.com. Today I'm your host, Mike Jones, and for our session we've asked Miguel Baltazar, our North American Services Director, to elaborate on what we think is the final hurdle to improving IT productivity. That hurdle is the need for rapid release cycles and better collaboration between your development and operations teams. Today Miguel is going to share five tips for overcoming what we like to call "releasephobia" and give you insights into how you can streamline your DevOps collaboration and processes. Welcome, Miguel.

Miguel: Hey, Mike. Thank you for having me.

Mike: Miguel, to kick us off, why don't you set the stage as to why overcoming releasephobia and improving DevOps has become critical for most IT shops?

Miguel: That is exactly what we've been seeing with most of our customers. The reason is that in recent years, with the rise of agile methodologies, development teams have become clearly more productive and more iterative. Using methodologies like Scrum, they release a potentially shippable product every two to four weeks. That is great because it allows the business to see value faster and allows development teams to be more productive. What ends up happening, though, is that the operations team is not able to release that fast. We see a lot of pressure building up on operations teams to keep up with the speed at which development teams are delivering their releases. That is actually a very complicated process, because when people look at staging a release of software from pre-production to production, they see it as a mix between science and art. A lot of them think it's a magical process with no predictable outcome: it could go really well and everything is released properly, or it could fail miserably and we have to roll back. I remember some years ago, when I used to work for a financial institution, there was a myth that if you released your software on an odd week of the year you would probably fail, and if you released it on an even week you would most probably be successful. There are a lot of myths around it. What I would like to share with you is what we've been seeing with our customers and prospects, and also in past professional experience, that actually slows down development operations, staging and release processes, and how we can fix it. To give you an analogy, doing a release of software these days is almost like committing a crime or robbing a bank. There's a lot of planning involved. Everybody has to spend multiple days planning.
Everybody's got a specific time at which they need to do their part. It's actually very precise. Everybody has their own role: they need to be at a certain place at a certain time to pick up the guys exiting the bank, drive away, and so on. What I would like to do is show you a little bit of the forensics of development operations and what we see as the top reasons why development operations are slow.

Mike: I couldn't resist interjecting just a little bit because I love the CSI analogy. I remember in my prior career as a product manager whenever we had a new build coming out of the product it was always a question whether or not it was going to be DOA, dead on arrival. I guess that aligns with this concept of forensics and crime scene investigation. Sorry to interrupt you.

Miguel: No, that's fine. It's a little bit like that. A lot of times deployments fail and no one can really tell why. That's the whole magic part behind it. One of the major reasons is that each release has its own fingerprint. No two are alike. There is a laundry list of items and steps: sometimes you need to touch the firewall, sometimes you need to change the database, other times you need to run upgrade scripts on the data, and a lot of different people need to get involved. You have to do a lot of one-time documentation. In order to get all the scripts together you need to spend hours putting together documents. In the case of the rollback script, it's one of the few parts of the process I've been involved with in which you write documentation that you hope will never be used. The best hope you have for that 5- or 10-page document you just wrote is that it goes directly in the garbage and no one has to use it. This is a pretty painful process for developers and for the people planning these releases. How can we try to address this? The most important thing is to go beyond piecemeal tools. Make sure that you are able to automate not only each one of the deployment steps (the upgrade of the database, the upgrade of the data, etc.) but also the process end to end, so that you know exactly what the steps are and create a repeatable pattern for that specific application. Of course, each application has its own fingerprint, but once you understand what that fingerprint looks like you will probably be able to automate a lot of the end-to-end process. Most importantly, if you are able to automate that, you can have standard rollback mechanisms for the specific application that has been deployed and avoid creating that one-time documentation that you hope will end up in the garbage can.
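
To make the idea of a repeatable, end-to-end release with a standard rollback more concrete, here is a minimal Python sketch. It does not describe how any particular product works; the `Step` and `run_release` names and the two example steps are hypothetical, chosen only to show each step carrying its own undo action so that no one-time rollback document has to be written.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    """One step of a release: how to apply it and how to undo it."""
    name: str
    apply: Callable[[], None]
    undo: Callable[[], None]


def run_release(steps: List[Step]) -> bool:
    """Apply steps in order; on failure, undo the completed steps in reverse."""
    done: List[Step] = []
    for step in steps:
        try:
            step.apply()
            done.append(step)
        except Exception as exc:
            print(f"step '{step.name}' failed: {exc}; rolling back")
            for finished in reversed(done):
                finished.undo()
            return False
    return True


# Hypothetical release: the schema change succeeds, the data script fails.
log: List[str] = []


def failing_data_script() -> None:
    raise RuntimeError("data script error")


steps = [
    Step("schema", lambda: log.append("schema up"), lambda: log.append("schema down")),
    Step("data", failing_data_script, lambda: log.append("data down")),
]
ok = run_release(steps)
```

In this run the schema step is applied and then undone, and the failed data step is never undone because it never completed. The same list of steps can be replayed for every release of that application, which is the repeatable pattern described above.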

Mike: Miguel, we've got a question from Bill in the audience. He asks, and I don't know how much experience you've got here, but his question is what tools are available in the market to address the release process?

Miguel: As I said, there's a slew of best-of-breed tools that allow you to do specific things: version control, code deployment, database differences in deployment, etc. What I haven't really seen, especially in the major tools that are out there, is something that takes an end-to-end approach and actually automates the process. It always ends up being the operations people who have to understand how to use each one of the tools and go through the laundry list of planning. One of the things we try to do at OutSystems is create a way for all these processes to be automated end to end. Tools like Subversion, Rational and Team Foundation Server that try to do this usually fall short in one or two of these processes. I cannot tell you exactly all the tools available in the market, but from my experience I haven't seen one that allows this process to be completely automated. Moving on, the next thing we see that really slows people down is the need to trace everything you've done since the last release. In a lot of the places where I worked prior to OutSystems, you would have two major releases in a good year. That means each time you plan the next release you have to trace everything you've done since the previous one and make sure you have all those changes lined up. Even if you're really good and you have the exact set of changes you made to the database, the code or the process, you don't really remember what happened in the last deployment. Maybe at the last minute something was not deployed because it was not correct, and now you're assuming it was. Very likely you will fail and things will not go as you expect. So not only do you have to spend a lot of time tracing what you've done, most likely you're going to fail. It's an inglorious kind of work. How can we fix this?
The most important part is being able to do automatic impact analysis and a differential between what is in production and what you're trying to release. Ideally that is done in real time as you're deploying, so you're not misled by data that has changed between the time you looked at it and the time you deploy. You're also upgrading all application layers. You're not just upgrading the code. You're not just upgrading the database. You're also upgrading things like business processes. You have an application that is live and in production, there are a lot of business processes running, and you're going to slightly change one of them. What is going to happen to the process instances that are currently in the status you changed? That is not something you can work out 50 days or 10 days in advance. You need to do it, more or less, at the time you're performing the upgrade. It is very important to be able to make those assessments as you go. Last but not least, and most importantly, make sure you do this without disturbing production. You only want to make the change in production at the last minute, when you're ready to do it and you're pretty sure things are ready to go.
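
As a rough illustration of a differential across all application layers, the Python sketch below compares a snapshot of production with a release candidate. The snapshot structure (layer name mapped to element definitions), the `diff_environments` function and the sample data are assumptions made for the example, not the format any real tool uses.

```python
from typing import Dict, List, Tuple

# Hypothetical snapshot format: layer -> {element name: definition}
Snapshot = Dict[str, Dict[str, str]]


def diff_environments(production: Snapshot, candidate: Snapshot) -> List[Tuple[str, str, str]]:
    """List (layer, element, change) for everything that differs from production."""
    changes: List[Tuple[str, str, str]] = []
    for layer in sorted(set(production) | set(candidate)):
        old = production.get(layer, {})
        new = candidate.get(layer, {})
        for name in sorted(set(old) | set(new)):
            if name not in old:
                changes.append((layer, name, "added"))
            elif name not in new:
                changes.append((layer, name, "removed"))
            elif old[name] != new[name]:
                changes.append((layer, name, "changed"))
    return changes


# Snapshots taken at deployment time, not days in advance.
production = {
    "database": {"Order": "Id, Total"},
    "process": {"Approval": "v1"},
}
candidate = {
    "code": {"OrderScreen": "v2"},
    "database": {"Order": "Id, Total, SupplierId"},
    "process": {"Approval": "v1"},
}
changes = diff_environments(production, candidate)
```

Here `changes` contains one added code element and one changed database entity, while the unchanged business process does not appear. Taking both snapshots at the moment of deployment is what keeps the comparison from going stale.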

Mike: Miguel, we have a question from Ranjit in the audience, who asks what the scope of the suggested approach really is. Is it focused on internal releases to customers, or are you also thinking about customers in regulated industries, like FDA-regulated environments? Do you have any feedback on that?

Miguel: That's a great question. We have a lot of customers that work in regulated environments, both FDA and SOX, etc. They really like the ability to do the impact analysis piece. That is exactly what they're looking for. They want to be sure that the version being deployed to production contains just the approved and tested changes. The ability to do this automatic impact analysis, and not just rely on wishful thinking that you are deploying those changes, is vital. In those kinds of environments, although they are regulated, nothing prevents the operations people from accessing the systems and understanding the differences between what is being deployed and what is in production. The regulations are mostly about who can do what and segregation of roles.

Mike: Thanks, Miguel.

Miguel: The third thing, and I've already hinted at it, is the fact that developers do not have access to production systems, or to pre-production systems for that matter, or sometimes even to QA systems. They end up working from an assumption of what is in production rather than being able to look at the configurations, logs and code that are actually running there. We talked about regulated systems, and a lot of times this is imposed, but most of the time it's mainly because operations people want to keep control over managing their own servers. What usually happens is that a few days before the deployment they allow you a guided tour of the server. You do a WebEx, they drive and you tell them what to do. It actually reminds me of that TV series where the female detective is able to look at a crime scene and, with a photographic memory, see everything so she never has to look at it again. Unfortunately, developers are not like that. They have to do this on their own time as they figure out what needs to be deployed. A half-hour guided tour of the production or pre-production server is not going to get that done. How can we fix this? The solution is pretty obvious. You need to provide cross-environment visibility. Find a way to allow developers to access the relevant information on production when they need to. Allow them to compare code, compare configurations and see what the differences are between the various environments. To make sure you're not breaking any regulations or policies you have in place, you need centralized access control over all environments, and you need to provide developers with only the right level of access. You need an easy way to migrate data from production into a pre-production or QA environment where the developers can look at it and understand the impact of the production data, and you need to give them access to the logs.
What we've seen is that if we allow developers to look at production logs, a lot of the "what is happening?" issues that would otherwise take months to solve get resolved quickly: you get the logs, you download them, you look at them, and you learn something new. This dramatically speeds up the process of deploying these applications. Number four: lack of motive. This is one of my favorites because it has a lot more to do with people than tools. What I've seen is a misalignment between what the development team is looking for and measured on and what the operations team is looking for and measured on. The development team wants to deploy the new features and get the code to production as soon as possible, so they can start working on the next release or move on to a different project. The operations team wants to maintain the status quo. They want to maintain the server, guarantee the uptime, make sure the server is performing, etc. There is always this tension between the two teams. Do you remember that example I gave you of my previous job, where we had odd weeks and even weeks? What actually happened is that there were two main people responsible on the operations team, and one of them, the one who worked the even weeks, was quite permissive. If there was something wrong with the deployment he would allow us to go in, analyze and fix it so that we didn't have to roll back. The other one, at the first sign of something not going according to plan, would immediately force a rollback and force us to come back a couple of weeks later, in the next available slot, to deploy again. This is actually a very interesting problem, and there's no simple way of solving it other than making the staging process easier, faster and safer. If you are able to say to the operations people, "It's very unlikely that this will fail," and they trust you, then they will be much happier to get it done with you.
Plus you can align priorities. You can say, "As I manage..." Mike, can you hear me okay? I was getting information there might be an issue.

Mike: You're fine.

Miguel: What we see is that if you measure operations people not only by the uptime and the SLAs of the servers but also by the number of releases that are deployed from development to production, you actually encourage them to help the developers be successful. It works the other way around, too: if you measure developers on the uptime and availability of the production solution, they will be sure to do their homework really well to help the operations people be successful. It is a problem.

Mike: Miguel, we've got two different questions. One of them states the reality of the scenarios. It's really not a question. "If I understand correctly then, agile is driving this rapid release process, putting the pressure on operations which should result in smaller releases." The question is a little jumbled but I think they're confirming that agile and the need for smaller, safer releases and going faster can work. The second question, from Jim, asks if you've seen any SLAs or priority alignments that work? Do you have any advice for the audience on things they can do to align the two teams a little better?

Miguel: As I said, changing their KPIs is definitely one of them. The other one I've seen is trying to get the operations people a little more ingrained in the project and the application. What we've seen work is asking the operations people to be part of the Scrum, or to attend the weekly or biweekly demos of the system, so they understand more about what the system is. They understand the operational requirements in terms of integration, mail servers, firewalls, etc. In a very recent project at a biotech company we did exactly this and got the operations team involved from the beginning. When we released, everything was flawless. One of them said, "It's the first time I'm releasing an application where I understand what it does." When he said this it was definitely an eye-opener. A lot of times the operations people are on the other side of the wall: something lands in their lap and they don't have any idea what they're releasing. The operations people want to know why we're developing that application and why it is vital that it is released very often. The last one, another favorite, is the graveyard shift. I can relate another story on this one. Again, in the financial institution I used to work for, our releases were about every six months. We would do all the extra work up front, tracing all the steps and preparing all the rollback mechanisms. You would start on a Friday at 11 PM and 8 to 12 people would join, depending on who needed to change what. We'd hang out a little bit, discuss, talk about it, and around midnight we would start our potential downtime window and deploy these applications. Of course, it was a very complicated process. This was a six-month release and it took a long time, because everything had to be changed. If we were lucky and the week was even, around maybe 4:00 A.M. to 5:00 A.M.
we would be done, everything would be in production, tested and smoke tested, and we could grab some breakfast and go to bed. If not, if the week was odd, most likely we would have had to do a rollback, everybody would be highly frustrated and no one would get home until 2 PM on Saturday. This is a very painful process, not only for the development team but mainly for the operations team. If you think about it, they're deploying applications every week, so they're doing this kind of party weekly and that takes a very high toll. How do we fix this? We talked about it already, but the problem mostly fixes itself if you do smaller, incremental releases. If you take the agile approach, you need to change a lot fewer parts of the application in each release than you did before. You are able to release more often, and if you can, as I said, include the operations team and make sure the same person is always responsible for deploying that application, then they'll better understand what that application does and what it needs in terms of the push. They become more self-sufficient and development doesn't need to do as much work. Last but not least, make sure the staging is easy, safe and fast. If you do that, then instead of doing these deployments in a downtime window in the middle of the night, you can do them during office hours. If you do it during office hours, everybody you need to help you is in the office. Nobody needs to come in for extra time and potentially lose part of their weekend building and deploying these applications.

Mike: Miguel, an interesting question: "Our environment is 24/7 and getting a downtime window for frequent patching is very difficult. Do you have examples where agile deployment worked in similar environments?"

Miguel: Yes. As I said, it really depends on how often you release. What we've seen is that if you release often you're changing a very small piece of the application instead of changing it all. Downtime becomes very uncommon. You probably get less downtime than you would if you had to, for example, try a few times and do a rollback because you are doing a bloated release. I'll give you an example. In another recent project, for a travel company, while the application is in maintenance they're releasing to production every two weeks. Again, we're talking about a 24/7, high-availability application that is exposed to the outside world. I can tell you it's www.fly.com, F-L-Y dot com. Every two weeks they release a new version to production. They do it during business hours and they have little to no downtime because of these very small, incremental chunks.

Mike: Thanks, Miguel.

Miguel: Now I would like to warn you that I'm going to speak a little bit about the OutSystems product and what we've done to address these issues. Release 7 of the platform is mainly focused on simplifying development operations. Our R&D engineers spent a lot of man-hours looking at this problem and analyzing each one of these issues. That's where part of this identification of problems came from, and we've created a tool called Lifetime. Before we go into Lifetime, I can tell you that it is embedded in a suite of tools that compose the Agile Platform. Although I'm mainly going to be talking about staging, and how you stage applications from development to production through QA, etc., you have a lot of other tools, Service Studio, Integration Studio and Service Center, that allow you to do the development, the integration and the deployment, and then manage all the change in these applications in an integrated way. That's what allows us to have Lifetime, which is an overview tool over all the infrastructure. If you look at the screen, what we have is a view in Lifetime of the infrastructure and its multiple environments. We can see we have a development environment. It's running .NET and SQL Server. It's running in the Amazon cloud, so you can see the little cloud there. This immediately tells the operations person, who handles multiple pipelines of environments every day, what they are looking at and what this deployment is for. They can immediately see the environment health and look at performance reports and configurations. With the right permissions you can also make this available to developers, again providing the full view of what is going on in your pipeline. In this case, you can see the production environment is in house, so it's no longer in the cloud. One tool to rule everything you have in your environment.
You also have visibility of your entire application portfolio. You can see here the list of applications that are running in this infrastructure, in the multiple environments. You can see the versions in each environment and you can see that they differ. For example, Version 1.3 of Expenses is in development, Version 1.1 of Expenses is in QA, and production shows no version, which means this application is not in production yet. Operations people, developers and sometimes even people at higher levels can look at this very easily. A manager who wants to see how things are going and what the application portfolio looks like can look at this tool and see what is in production. Then if they want to do a staging it's very easy. They click on "deploy" and it allows them to publish applications from one environment to another. Now, instead of showing you PowerPoint slides, I'd like to show you a real-life scenario. Let me share my desktop here. I'm going to log into Lifetime as a developer. In this case, we only have a development and a production environment. It's not a typical case, but I'm doing it this way to simplify the demo. I can see all the applications I have access to. I can see Order Management, which is the application I've been developing, and I can see we have Version 7.0 in production and we have Version 7.0+, which means I've done some development on it since. What I want to do is drill down into a little more detail and see all the changes. This one has a little "+", which means it's new. What I'm going to do is click "deploy". Remember, I'm a developer, I'm looking at everything that I need here and I'm clicking "deploy". As I click "deploy", the Lifetime tool does the automatic impact analysis across the multiple environments I'm looking at. It tells me I might want to tag and deploy Version 7.1 of Order Management, but it also tells me that there are some problems with it.
There are other modules that depend on it. I can click here and see what's going on. There's an application called Mobile Catalog that is in production. If I deploy the new version of Order Management I am going to create an incompatibility that might generate some downtime. If I drill down a little further, I can click on this equal sign and it tells me there is a difference in the versions that were published. There's an equal sign again, and if I click it, it launches Service Studio, which I need to share. Because I've been granted permission to look at this application both in development and production, I get the exact comparison between development, on this side, and production. I can see that the only thing that is different is a field that was added to the database called "Supplier ID", which I added yesterday. This is the only thing that is different. In regulated environments this is one of the most powerful things you can do. I say, "Okay, if that is the only difference then I'm okay with deploying this," and I tag and deploy Version 5.1 of Mobile Catalog. I can also see that Sales Report depends on Order Management, but in this case Version 0.7 is compatible. If it is compatible I don't need to deploy Version 0.8; I can just redeploy Version 0.7 to make sure all the references are okay. I click "confirm deployment" and it takes me to my deployment page. I can paste in my release notes: I did some data model changes, I integrated with the supplier portal, I solved some issues. You can see that because I'm a developer I cannot deploy. That is segregation of roles: developers cannot deploy applications to production, but I can save this plan. I've traced all my steps, I know what I changed, I'm the most competent person to decide what needs to be published, and now I hand it over to my operations people. Now I'm logging out as a developer and logging in as operations.
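
The dependency check in this part of the demo can be approximated with a short Python sketch. It is only an illustration under stated assumptions: the `App` class, the `compatible_with` field and the `impacted_consumers` function are hypothetical, and the production version of Mobile Catalog (5.0) is invented for the example because the demo does not state it.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class App:
    """An application running in production (hypothetical model)."""
    name: str
    version: str
    # producer name -> producer versions this app is known to work with
    compatible_with: Dict[str, Set[str]] = field(default_factory=dict)


def impacted_consumers(production: List[App], producer: str, new_version: str) -> Dict[str, str]:
    """Classify every production app that depends on `producer`.

    "redeploy": compatible with the new version, only references are refreshed.
    "upgrade":  incompatible, a new version of the consumer must ship as well.
    """
    result: Dict[str, str] = {}
    for app in production:
        if producer not in app.compatible_with:
            continue  # this app does not depend on the producer
        if new_version in app.compatible_with[producer]:
            result[app.name] = "redeploy"
        else:
            result[app.name] = "upgrade"
    return result


production = [
    App("Mobile Catalog", "5.0", {"Order Management": {"7.0"}}),        # version assumed
    App("Sales Report", "0.7", {"Order Management": {"7.0", "7.1"}}),
    App("Expenses", "1.1"),
]
plan = impacted_consumers(production, "Order Management", "7.1")
```

With this data, `plan` marks Mobile Catalog for an upgrade and Sales Report for a plain redeploy, and leaves Expenses out because it has no dependency on Order Management. That mirrors the deployment plan the developer saves in the demo.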

Mike: Miguel, can I interrupt? While you're logging in I've got a question from Jim. He asks, "Can Lifetime bring non-Agile Platform apps under control?" I'd like to extend that because most of these apps are complex; you talked about this earlier, and they also have integration, not only can it bring non-Agile Platform apps under control but what about application integrations? Can it help you understand the impact there and bring that under control as well?

Miguel: The answer to the first question is, unfortunately, no. Lifetime is a tool that is fully integrated with the Agile Platform in a "best of breed" way. The fact that we use visual modeling to create these applications is what really allows us to track all the dependencies and the objects that depend on each other in order to do this impact analysis, not only at the single-server level but also across multiple servers in the infrastructure. In terms of integration, though, it does allow you to do that. Lifetime checks whether an integration is outdated or deprecated. For example, if you use web services it looks at the WSDL, the Web Services Description Language document of the service you're consuming, and if something is outdated it will let you know as you publish your application. You can then act on it immediately. So, unfortunately, no to the first part of the question, and yes to the second part: it allows you to control not only data, code and process but also integrations with other systems. Now, I have a button here that says "Plan". Clicking on it, I can see the plan that Sean Lauper created. I want to deploy Order Management Version 7.1 and Mobile Catalog Version 5.1, and redeploy Version 0.7 of Sales Report. I see all the release notes, and everything is fine. I can go ahead and click "deploy". The server is now going to automatically work out all the differences between development and production. It takes all the models, makes sure it understands the differences, regenerates the C# code, or the Java code, for the production environment and then hot-deploys it into production. While this is happening I want to show you one more thing that has to do with user access, which is important. As an operations person, I can go in and define roles across the multiple environments.
If I look at the developer role, I see that developers can change and deploy applications in development but can only list and see what is in production. Every developer that I assign this profile to will have these permissions. In addition, I can do more granular control. I can look at Sean Lauper, and you can see he not only has all the permissions of the developer role but also has special permissions. He's responsible for Order Management and Mobile Catalog. You could see the code differences between environments earlier because he is able to open and reuse Mobile Catalog. You can turn these dials up and down, and you can bring it back down to listing versions only, but in this case you're giving one specific senior developer the ability to do the impact analysis and simplify the whole process. This is still running, but when it ends you will see that the deployment has moved from development to production and you'll have Version 7.1 of Order Management in production. Let me go back to my slides here. In a nutshell, Lifetime gives you full visibility over all your environments, so you understand what is going on in each environment, what your application portfolio is and which versions are where. It allows you to centralize the security information for all the people managing those environments, so you can easily provide certain developers with access to what they need to plan their deployments. It provides total control of your application portfolio, and it makes sure that deploying applications is easy, fast and safe, removing the headaches we had with bloated releases of highly complex applications, and a lot more. I invite you to take a look or see a more in-depth demo of what Lifetime does and also what the Agile Platform does.
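
The role model described in this part of the demo, a default level per role and environment plus per-application overrides for specific people, might be sketched in Python as follows. The level names, the `ROLE_DEFAULTS` and `OVERRIDES` tables and the `can` function are assumptions for illustration and do not reflect Lifetime's actual permission names or API.

```python
from enum import IntEnum
from typing import Dict, Tuple


class Level(IntEnum):
    """Hypothetical permission levels, ordered from least to most access."""
    NONE = 0
    LIST = 1               # see which applications and versions exist
    OPEN = 2               # open an application and compare its contents
    CHANGE_AND_DEPLOY = 3  # modify and deploy


# Default level for each role in each environment.
ROLE_DEFAULTS: Dict[str, Dict[str, Level]] = {
    "developer": {"development": Level.CHANGE_AND_DEPLOY, "production": Level.LIST},
    "operations": {"development": Level.CHANGE_AND_DEPLOY, "production": Level.CHANGE_AND_DEPLOY},
}

# (user, application, environment) -> level that replaces the role default.
OVERRIDES: Dict[Tuple[str, str, str], Level] = {
    ("sean", "Order Management", "production"): Level.OPEN,
    ("sean", "Mobile Catalog", "production"): Level.OPEN,
}


def effective_level(user: str, role: str, app: str, env: str) -> Level:
    """An override, if present, wins over the role default (dial up or down)."""
    base = ROLE_DEFAULTS.get(role, {}).get(env, Level.NONE)
    return OVERRIDES.get((user, app, env), base)


def can(user: str, role: str, app: str, env: str, needed: Level) -> bool:
    """True if the user's effective level is at least the level needed."""
    return effective_level(user, role, app, env) >= needed


sean_can_compare = can("sean", "developer", "Order Management", "production", Level.OPEN)
sean_can_deploy = can("sean", "developer", "Order Management", "production", Level.CHANGE_AND_DEPLOY)
```

In this sketch the senior developer can open and compare the two applications he is responsible for in production, but still cannot deploy there, which keeps the segregation of roles intact.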

Mike: Miguel, thank you very much. We've got one more question that just popped in. This final question is from Jim again and he asks, "How does Lifetime handle synchronizing test data across multiple apps?" Good question.

Miguel: It is a good question. Lifetime does not specifically migrate production data back to pre-production. There are other ways of doing it, but Lifetime does not handle it specifically. The developer is the one who knows what needs to be changed in the database, whether you need to run an update script on the data, etc. What the platform allows you to do is create batch jobs and schedule them to run just after deployment. Although the operations person clicks the button, once the application is fully deployed into production the platform itself calls a batch, asynchronous job that upgrades the data and brings it up to date with the new data model if need be. Again, we're putting the responsibility where it needs to be, on the developer, and that job is also automatically moved forward as the code gets deployed.

Mike: Thanks, Miguel. I want to thank everyone for their participation. This is the final session in our productivity webinar series. You can get all the information at this URL. There's one landing page where you can get all the recorded webinars, supporting content, etc., so please share that if it makes sense. Take advantage of it if it will help you as you work on improving productivity within your own organization. I also invite you to take advantage of Agile Platform 7.0. It's free and you can download it at OutSystems.com. That concludes today's webinar. Thank you very much.
