Running web CRO at scale: ideation, documentation, and hard truths.
Misho Chirikashvili, Web Optimization Manager at Pipedrive, breaks down what it takes to run a mature CRO function at scale - from ideation systems and documentation practices to the hard truths most teams avoid talking about.
Ideas come from everywhere - but need grounding
Good test ideas can come from research, usability studies, benchmarking, or gut instinct. But the best ones are always tied to specific, validated customer problems rather than subjective design preferences.
Scaling velocity requires infrastructure first
Before running more tests, build the system: a structured backlog, a clear pipeline, and dedicated resource support. Velocity without infrastructure creates noise, not signal.
Losing tests are wins in disguise
A -20% result tells you more than a flat test. It proves the hypothesis was wrong, validates the need for experimentation, and frames the outcome as 'dodging a bullet' rather than failure.
Chase learnings, not win rates
Optimizing for win rate leads to safe, incremental tests that teach you nothing. Programs that optimize for learnings run bolder experiments and produce smarter decisions over time.
Reducing content doesn't always simplify
B2B buyers in evaluation mode will read what's there if it's useful. Stripping context often causes confusion - not clarity. Simplification should be driven by research, not subjective minimalism.
Low traffic is not a CRO blocker
Lower your confidence threshold to 85-90%, focus on secondary funnel metrics, and plan to retest. If your waiting period is six months or more per test, qualitative research may produce more value than pure A/B testing.
Hey, welcome to the first episode of Web Unpacked. The idea behind Web Unpacked is quite simple - we want to discuss everything around website conversion, SEO, AI, AEO, tech stacks, QA - everything related to websites. No buzzwords, no fluff, just honest conversation with people who build, maintain, and improve the web every day. Today's guest is Misho, Web Optimization Manager at Pipedrive - basically the person behind day-to-day experimentation on a site with millions of visits. We'll talk about how to validate ideas and how to turn traffic into actual business impact. What makes this conversation valuable is that Misho works at the intersection of experimentation, documentation, and research. He's seen what actually works and what really doesn't. Misho, for those who don't know you, let's set the scene. Can you walk us through your role at Pipedrive and what your day-to-day looks like?
Yeah, for sure. First of all, I'm very excited for this talk, and thanks for inviting me here. As you mentioned, I own the experimentation on our web platform, and it includes everything: research, documentation, task delegation, product management. But of course the main topic is the experimentation and the tests that we run. I'm responsible for coming up with them and for seeing them through the whole testing life cycle - analyzing them, drawing conclusions, understanding what the outcome is. Throughout that journey I work with the dedicated stakeholders for each stage. I'm basically behind the scenes of everything involved in experimentation, but mainly focused on web.
That's a lot to do. When people talk about experimentation, the conversation usually goes to frameworks, scoring models, how you measure things. But in my opinion everything actually starts with the ideas. Could you walk through where good ideas usually come from and how you approach that?
Yes. Before I go into too much detail about each stage, I just want to be clear that I'm not trying to push specific ideas or suggestions - some things work for me that might not work for others. About the ideas, though: the sources can be anything - research, benchmarking, customer journeys, feedback you receive, even gut-feeling testing, which is done very frequently. The tricky part is finding the balance between gut-feel experiments and grounding them in actual facts and customer problems. What I do most often is collect ideas. If you have ideas you think are good enough to test - they could be subjective, they could come from benchmarking - collect them, but don't push them too hard right away. What's important is that once you have a specific set of problems you want to solve - opportunities you've surfaced through research, usability studies, or interviews - you can go back to that backlog and see if you already have solutions matching those problems. Or you hold a new ideation session with the right stakeholders to come up with new solutions. That usually works best at the start of the test life cycle.
Yeah, it probably matters that you find all the different ways to get that customer feedback - their problems, their issues, their needs. There are so many different sources. And quite importantly, sometimes you don't have the resources to do all that research. In that case, the best move is to just start testing, even if it's not grounded in solid reasoning - but with an expectation of learning, not of having winners. You get some feedback, you iterate. One of the most underestimated things is the learnings you get from experiments. Plus you can get a little scrappy with AI - if you find a couple of customer problems, use AI to figure out what good solutions to those problems might look like. Sometimes limitations can be a real blessing.
Yes, exactly. And we should take AI with a grain of salt, but it really helps. It can be a nice assistant to bounce different things off and find good solutions to different problems. You just shouldn't depend on it too much.
All right, let's move on to scaling. Every team and every company wants to run more tests and run them faster, but very few actually manage to do it - because the more tests you run, the more noise there is and the more documentation there is. And many teams don't realize how much admin work sits behind heavy testing. So what's your take - how do you increase testing velocity while maintaining quality?
That's exactly right. Once you reach the stage where you're thinking "okay, we can do more tests, we can increase velocity," the first thing to think about is having the right setup to allow it. What I mean is the tools you use. You don't need all the fancy tooling, but you need some system and structure behind it: when you have new ideas, you have a place to store them; when ideas can move to the next stage, you have a clear pipeline for them. You need proper resource support - design, engineering, analytics. Once you have a good setup, you can start thinking about more ideas. You can think about the traffic you have and how to leverage it as effectively as possible. Sometimes traffic is too low. Then you need to make compromises - for example, not waiting two or three months for scientifically rigorous results, but trading some certainty for speed. And especially when you're testing a lot, remember that it will never be perfect. There are always parts of the process that aren't ideal. It's constant iteration, and you have to keep improving.
How do you feel about testing with low traffic? That's a big issue for many companies - they just don't have enough traffic to run meaningful A/B tests. How should they approach CRO in low-traffic environments?
It depends a lot on where you're doing the CRO. For web, especially at SaaS companies, the main metric is usually signups or trial starts. But when you're not getting results quickly enough, you can start looking at secondary metrics: how many people actually started the signup, how many moved to the next stage of the funnel. That gives a good indication of where the test is going. Instead of a 95% confidence level, you can use 90% or even 85%. That gives you some leeway - with the caveat that finalizing the experiment doesn't mean it ends there; after some time you should retest and follow up with new iterations to validate those results. And for companies that really don't have any traffic, there's always research. Always get customer feedback, always understand the problems. Do your best to find the low-hanging fruit. Even if you can't properly test, you can still get directional signals. If the waiting period for a single test is six months or more, data-driven methodologies don't really make sense - it might not be for you. But if your traffic is better than that, there's definitely a way.
Let me ask about losing tests - what's your take on those? Because as you know, most of your tests might lose. You can't really control the win rate as much as you want. What's your approach?
Unpopular opinion, but I love losing tests. Not always, of course - there should be some contrast between winners and losers. But I would much rather have a minus 20% experiment than a flat one, because then you know something is not working. You can adopt the mindset of "we dodged a bullet" rather than "we lost an experiment" - the bullet being a change we would otherwise probably have shipped, because with each experiment there's reasoning behind why you're running it; you think it will work. When it's a loser, it gives you another validation of why you need experimentation. There's probably a good reason behind why it lost, which means you have a new learning - and you can often turn it 180 degrees and make it a win. Also, on chasing win rates: people should be chasing tests that produce learnings instead of only win rates. Otherwise you get into the mindset of "I don't want to test this because it might lose and I need more winners." That keeps you from going bold and limits you to safe bets. You should have a structure and a system where every experiment teaches you something regardless of the result.
I want to wrap this episode with a small tradition - a section called "things nobody says out loud." Because every team has those quiet truths - things everybody thinks about but nobody says out loud. What's something in experimentation that everybody kind of secretly knows but nobody wants to say publicly?
What I don't like is depending too much on prioritization frameworks - not prioritization overall, but the frameworks, which are usually quite subjective. They can be a really good way to give you some direction, but it should still be you who decides what to test, because you have much more context on the strategy, on what you're testing, on which pages are underperforming and need more attention. ICE and RICE are good frameworks, but not as good as people probably think. You have to figure out something that works for you, your business, your team, your resources. Just bucket things the right way so you know which test to take on next.
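For reference, one common formulation of ICE multiplies three 1-10 ratings - Impact, Confidence, Ease - into a single score. A minimal sketch in Python (the idea names and ratings below are invented for illustration):

```python
# ICE scoring sketch: each idea gets (impact, confidence, ease) ratings
# from 1 to 10, multiplied into one score. The inputs are subjective,
# so treat the ranking as a rough bucketing aid, not the final decision.
ideas = {
    "rewrite hero headline": (7, 5, 9),   # (impact, confidence, ease)
    "redesign pricing table": (9, 6, 3),
    "add social proof strip": (5, 7, 8),
}

scores = {name: i * c * e for name, (i, c, e) in ideas.items()}
for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{score:4d}  {name}")
```

As Misho notes, the output is only as good as the subjective inputs - useful for picking the next test, not for settling strategy.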
What's one thing someone running CRO at a SaaS company should absolutely stop doing?
Only testing things you think are good - going by gut feeling: it looks cool, it will surely work. Especially when we see nice elements or designs on other websites while benchmarking - essentially subjective testing. It doesn't work. Sometimes it might, but doing it every time isn't the way to go, and it gives you false confidence when one of your gut-feel tests happens to win. I made that mistake in the early days too.
What's one belief about experimentation you changed your mind about this year?
It was actually this year that we started really pushing running several tests at the same time on the same page. It's a bit controversial, but done right, you can. For example, you shouldn't run two experiments that overlap on the same element, but one experiment can be in the hero section and another somewhere further down. The uplift and performance you see for each experiment is still relative between the original and the variation. It's interesting because I always had the very traditional experimentation mindset - one test at a time, nothing more, because anything else will pollute the data. Sometimes it might, so if you do this, don't end the experiments too early. You need a large enough sample size to make sure each variation is treated equally.
What do you see teams overcomplicating when it comes to CRO?
There are many things people overdo. Sometimes we don't need as many meetings as we fill our calendars with for everyday experimentation. If you find a way to have one meeting per week instead of one every day to go over what the team is working on, and everyone gets time to actually work on those things, it will be much more effective than an hour-long daily standup that doesn't accomplish much and pulls people away from their focus. As a CRO practitioner you need dedicated time to think about ideas and learnings - all of that needs mental capacity. A lot of things can be done async. For ideation, for example, you can write down ideas for the introduced problems in your own time, then gather just to discuss them and agree on action steps.
Last question - what's one test outcome you saw recently that surprised you?
Probably when we tried to simplify things on our websites. We think some elements are way too overcrowded with information - and that's an example of subjective thinking. Then we strip away elements, reduce the copy, reduce what we show, and it doesn't work. There have been so many cases of that. It makes you realize that information is good if it's done properly: if it gives proper context, removing it can create confusion. If the information adds context and is useful, users might not mind being loaded with it, as long as they get it fast and in one place. Especially in B2B, you can assume that people interested in your product will read what you have. They don't just scan and decide - if someone really wants to decide, they will read what you've written.
Cool. That's it for today's episode of Web Unpacked. Misho, thank you so much for taking time and sharing all of this with us. I'm sure people loved these insights. And if you liked the episode, don't forget to subscribe and see you on the next episode.
Thank you.
How do you generate good A/B test ideas at scale?
Build a backlog of ideas from multiple sources - research sessions, customer interviews, usability studies, benchmarking against competitors - and separate the ideation phase from the prioritization phase. When you identify a set of specific customer problems, go back to the backlog to see if you already have solutions that match. If not, hold a focused ideation session with the right stakeholders.
What's the right approach to CRO when you have low traffic?
Lower your confidence threshold from 95% to 85-90%, focus on secondary metrics (funnel progression, form starts) that accumulate faster than primary conversions, and plan to retest and iterate rather than treat one test as final. If traffic is extremely low with a waiting period of six months or more per test, qualitative research and best-judgment changes become more valuable than pure data-driven A/B testing.
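To make the threshold trade-off concrete, here is a minimal sketch of a standard two-proportion sample-size calculation - not tied to any tool mentioned in the episode; the 3% baseline conversion rate and 15% relative lift are illustrative assumptions:

```python
# Approximate visitors needed per variant to detect a relative lift in a
# two-sided two-proportion test, at different confidence thresholds.
from scipy.stats import norm

def sample_size_per_variant(p_base, lift, alpha=0.05, power=0.8):
    """Standard two-proportion sample-size approximation."""
    p_var = p_base * (1 + lift)                # expected variant rate
    p_bar = (p_base + p_var) / 2               # pooled rate
    z_alpha = norm.ppf(1 - alpha / 2)          # two-sided critical value
    z_beta = norm.ppf(power)
    num = (z_alpha * (2 * p_bar * (1 - p_bar)) ** 0.5
           + z_beta * (p_base * (1 - p_base) + p_var * (1 - p_var)) ** 0.5) ** 2
    return int(num / (p_base - p_var) ** 2) + 1

for conf in (0.95, 0.90, 0.85):
    n = sample_size_per_variant(p_base=0.03, lift=0.15, alpha=1 - conf)
    print(f"{conf:.0%} confidence: ~{n:,} visitors per variant")
```

In this example, dropping from 95% to 85% confidence cuts the required sample by roughly a third - exactly the leeway low-traffic teams are trading against a higher false-positive risk.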
Is it OK to run multiple A/B tests on the same page at the same time?
Yes, if the tests don't overlap on the same elements. One test can run in the hero section and another lower on the page simultaneously. The key is letting both tests reach a sufficient sample size before you conclude, so that each variant of one test sees an even mix of the other test's variants. Running non-overlapping concurrent tests is a reasonable way to increase velocity without polluting results.
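One way to keep concurrent assignments independent - a minimal sketch, not Pipedrive's actual setup - is to hash the visitor id with a per-experiment salt, so the same visitor lands in uncorrelated buckets across experiments:

```python
# Deterministic, per-experiment bucketing: salting the hash with the
# experiment name makes assignments independent across experiments, so
# each variant of one test sees an even mix of the other's variants.
import hashlib

def assign(visitor_id: str, experiment: str,
           variants=("control", "variant")) -> str:
    digest = hashlib.sha256(f"{experiment}:{visitor_id}".encode()).hexdigest()
    return variants[int(digest, 16) % len(variants)]

visitor = "user-42"
print(assign(visitor, "hero-headline"))   # assignment here...
print(assign(visitor, "pricing-table"))   # ...doesn't constrain this one
```

Because the salt differs per experiment, knowing a visitor's bucket in one test tells you nothing about their bucket in the other.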
Why should CRO teams stop chasing win rates?
Because optimizing for win rate leads teams to run only 'safe' tests that are unlikely to lose - which means small, incremental changes. A losing test that produces a clear learning is more valuable than a flat result that teaches you nothing. The goal is to make decisions faster and smarter over time, and that requires tolerating losses.
What's the most underrated thing in a CRO program?
The learnings from losing tests. Most teams report on wins but file away losses. Treating every result - positive or negative - as a learning that informs the next hypothesis is what separates mature CRO programs from ones that run tests and move on.
Misho Chirikashvili runs web optimization at Pipedrive, one of Europe's leading SaaS CRMs. He's deep in the mechanics of running CRO programs at scale - the systems, the process, and the things nobody talks about.