Response Time Testing – Methodology Madness

  #66370
    PCM2

      I recently received an excellent question on my YouTube channel from somebody rightly confused by the conflicting messages they were getting from different reviewers about response times on various models. In particular, the way certain sources will describe a monitor as “really fast” whereas others might suggest it’s actually quite slow. The question was asked on our video review of the Corsair XENEON 32QHD165 – a model we recently reviewed and that has also been reviewed by various other sources. But the question was more general:

      “Very confused by all the reviews out there. Rtings and hardware unboxed get wildly different results. Just hard to know who to believe. I think your reviews are the best because you show real life testing etc.”

      When reading the points below, be aware that I am a great fan of Hardware Unboxed (HUB) in general and closely follow their work. I really like Tim and Steve’s style and dedication, too. But I can’t say I’m a fan of their new monitor test methodology. I also find RTINGS a useful reference, but as with HUB I know how to balance and interpret the response time figures and I wouldn’t expect the casual observer to do this or necessarily understand how to.

      Hardware Unboxed use an extremely stringent test methodology which heavily penalises areas of response performance that have very little visual impact. Ironically, the gamma correction in their new methodology was intended to do the opposite and represent response performance more realistically. They end up putting people off even trying monitors which in all likelihood they’d find visually excellent for pixel responsiveness – and which, around a year ago, HUB themselves would’ve praised for excellent responsiveness. They’re also far too lenient when it comes to overshoot in my view, as they’d recommend settings for monitors such as the 32QHD165 which produce very clear overshoot in practice. I can almost guarantee very few would want to use ‘Fastest’ on this monitor over ‘Fast’ at 165Hz, for example. If they end up really praising a monitor for fast pixel responses, though, you can have absolutely no doubt that it will be very strong indeed in that respect. Some technical background follows.

      RTINGS look at the traditional portions of the pixel transition as well as the full 0 – 100%; their own methodology documentation explains this in more detail. The part of RTINGS’ response time measurements most people focus on (and which they use for their own comparisons) is the 10 – 90% or 90% – 10% ‘Rise / Fall’ data, measured roughly as sketched below. This is really too lenient and will show some monitors as being very fast where in practice you’ll notice some weaknesses by eye. For some reason their methodology also seems extremely sensitive to overshoot, and they’ll often disregard settings whose overshoot would actually be quite tolerable for most users. This is essentially the opposite of HUB’s current methodology.
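
      To make that ‘Rise / Fall’ figure a little more concrete, here’s a minimal sketch (Python, with made-up sample data) of how a traditional 10 – 90% response time could be pulled from a captured luminance trace. It’s purely illustrative and isn’t RTINGS’ actual tooling – the function names, sampling interval and trace values are all my own assumptions.

# Illustrative sketch only - not RTINGS' actual tooling. Shows how a
# traditional 10-90% ("rise/fall") response time can be read off a
# photodiode/oscilloscope luminance capture. All sample data is made up.

def response_time_10_90(times_ms, luminance, start_level, end_level):
    """Time taken to travel from 10% to 90% of the luminance swing.

    times_ms    -- sample timestamps in milliseconds
    luminance   -- measured luminance at each timestamp
    start_level -- settled luminance before the transition
    end_level   -- settled luminance after the transition
    """
    swing = end_level - start_level
    lo = start_level + 0.1 * swing   # 10% point of the transition
    hi = start_level + 0.9 * swing   # 90% point of the transition
    rising = swing > 0

    def first_crossing(threshold):
        # First timestamp at which the trace passes the threshold
        for t, lum in zip(times_ms, luminance):
            if (lum >= threshold) if rising else (lum <= threshold):
                return t
        return None

    t_lo, t_hi = first_crossing(lo), first_crossing(hi)
    if t_lo is None or t_hi is None:
        return None  # transition never completed within the capture window
    return t_hi - t_lo


# Example: a made-up dark-to-light transition sampled every 0.5ms
times = [i * 0.5 for i in range(20)]
trace = [0, 2, 10, 30, 60, 95, 130, 160, 185, 205,
         220, 232, 240, 246, 250, 252, 253, 254, 255, 255]
print(response_time_10_90(times, trace, start_level=0, end_level=255))  # -> 4.0 (ms)

      Note how the first and last portions of the transition are excluded entirely – which is exactly why this measure can flatter a panel whose transitions tail off slowly towards their final value.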

      TFT Central use a gamma-corrected methodology as well as their ‘traditional’ (10 – 90% / 90% – 10%) methodology. I really like the balance they strike with the new approach, the basic idea of which is sketched below. It’s much less lenient than the ‘old method’ (i.e. the industry standard method used for many years) but doesn’t go to the extremes that HUB does. I find their data most closely aligns with my own thoughts, feelings and visual data (pursuit photographs). It’s a good middle ground between HUB and RTINGS in my view.
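
      For contrast with the traditional measurement sketched above, here’s a minimal sketch of one way the gamma-corrected idea could work: judging when a transition is complete based on perceived light output mapped back to shade levels, rather than on percentages of the raw luminance swing. This is my simplified interpretation rather than TFT Central’s or HUB’s actual implementation – the simple 2.2 power-law gamma and the ±5 RGB value tolerance are assumptions chosen purely for illustration.

# Simplified interpretation of a gamma-corrected measurement - not TFT
# Central's or HUB's actual implementation. The 2.2 power-law gamma and the
# +/-5 RGB tolerance are assumptions chosen purely for illustration.

GAMMA = 2.2

def luminance_to_rgb_level(lum, white_lum):
    """Map a measured luminance back to an approximate 0-255 shade level,
    assuming a simple 2.2 power-law gamma for the display."""
    return 255 * (lum / white_lum) ** (1 / GAMMA)

def gamma_corrected_response_time(times_ms, luminance, white_lum,
                                  target_rgb, tolerance_rgb=5):
    """Time until the output first comes within +/-tolerance_rgb of the
    target shade, judged on perceived light output. Assumes the capture
    starts at the moment the transition begins."""
    for t, lum in zip(times_ms, luminance):
        if abs(luminance_to_rgb_level(lum, white_lum) - target_rgb) <= tolerance_rgb:
            return t
    return None  # never settled within tolerance during the capture window


# Example with a made-up trace for a transition towards RGB 200 on a
# hypothetical ~130 cd/m2 panel
times = [i * 0.5 for i in range(10)]
trace = [2, 10, 25, 45, 60, 70, 74, 76, 77, 77]   # luminance in cd/m2
print(gamma_corrected_response_time(times, trace, white_lum=130, target_rgb=200))  # -> 3.0 (ms)

      Because equal RGB steps are nowhere near equal luminance steps, a tolerance defined this way is far stricter about transitions that creep up on their final value – which is broadly why gamma-corrected figures read as much less lenient than the traditional ones.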

      I assess things very differently. Any visual assessment which is documented or recorded is carried out after the monitor has been used for several days and has been running for at least 2 hours (usually longer), so it’s properly ‘warmed up’ and should be performing at its best. I’m not saying others don’t do this, of course, just pointing it out as a potential variable to consider and something I definitely take into account during testing. There are two main types of ‘visual assessment’ I use on a monitor, with data actively collected once it’s primed and ready:

      1) Specific tests which isolate transitions. These are the pursuit photographs shared in reviews, which use Test UFO’s Ghosting Test at various refresh rates and overdrive settings. I’ve discussed my testing conditions, camera and settings with Mark Rejhon of Blur Busters, who created the test, to make sure things are done properly. These pursuit photographs very accurately portray what you’d see at a given moment in time when running that test in person on the monitor (using defaults for the test), so they can absolutely be trusted as a reliable indication of performance and cross-compared with other models we’ve reviewed. I reinforce this with observations (not recorded) using A5hun (Aperture Grille)’s Frog Pursuit or ‘Smooth Frog’ (current latest version), which isolates some great transitions that many monitors will struggle with a bit – or potentially a lot, for that matter. The white frog against the medium or light grey background is often very unforgiving and I’ll find my eyes drawn to similar weaknesses in game scenes.

      2) Observations in games such as the latest Battlefield and Tomb Raider titles (and movies in written reviews, as a form of lower frame rate testing). I actually play a greater variety of content than that and don’t always actively review using all of it. I really like to just use a monitor as normally as possible, including a good dose of gaming, for 2-3 weeks and document my findings as I go. With games like this you’ll encounter a huge array of transitions. With 256 shade levels there are tens of thousands of possible pixel transitions to consider, and no reviewer will be covering the full range. I don’t expect them to, of course, but sometimes I might observe certain weaknesses such as overshoot during quite specific transitions. These may be missed if they don’t happen to coincide with the values another reviewer tests. Usually the most obnoxious issues will be covered by the common shade levels tested, however.

      So the response time measurement methodologies used by different websites differ tremendously. All methods have some merit, but I do feel some have gone to extremes to collect “data for the sake of data” and can ultimately mislead people. I see my own testing as complementary to other testing as I do things very differently. I always seek user feedback for models I review to make sure I’m providing the right sort of balance in my reviews in this area. It’s why I used the Gigabyte M27Q as a reference in the 32QHD165 review and referred to it as a model which “offers a level of responsiveness many are comfortable with”. The Corsair was quite a bit stronger than that baseline, too. Ultimately, sensitivity to weaknesses in pixel responsiveness and overshoot varies. But I am frequently reassured and frankly humbled by this feedback – it seems I strike a good balance here and accurately convey what people can expect from a monitor in practice. I’d recommend using all the data available to you from all these sources to help build up a picture, but don’t be afraid to question those potentially misleading extremes.

      #66374
      Gigamike

        Yeah, I favor your tests over HUB’s. Timings themselves are so difficult to translate into exactly how it looks in practice. That’s where I wish they’d do UFO ghosting shots or actual filming of the motion while in use. Keep up the good work!

        #66384
        sayhejcu

          I really like Tim but seriously he is blind to overshoot.

          My opinion: unless it’s as fast as OLED, a small response time advantage means nothing. The important part is having artifact-free motion, and overshoot destroys it. Which makes UFO pictures more reliable – BUT nowadays static response times mean little on their own. It should be tested with VRR from 40fps up to maximum fps. This is where a5hun’s frog test beats the UFO test.

          #66386
          PCM2

            I know what you’re trying to say there – that some of the minutiae of response time differences between models are not as important as overshoot. But pixel responsiveness certainly is still important at the same time. Otherwise you could say pretty much any modern monitor offers a ‘single overdrive experience’ as long as its lowest pixel overdrive setting has low overshoot at all refresh rates (which is quite often the case). Sensitivity to both overshoot and weaknesses in pixel response time varies, but there’s definitely a balance to be struck when trying to recommend or not recommend a product on that basis. And people tend to be much more tolerant of what amount to rather slight pixel response time weaknesses than of potentially rather visible and eye-catching overshoot.

            The ability to seamlessly change refresh rate with the slider in Frog Pursuit is another reason I like using it for observations, though I suggested to a5hun that he integrate an automatic refresh rate cycle feature a bit like Nvidia’s Pendulum demo. Having the movement speed tied to the refresh rate is also quite awkward (a5hun noted this), though it still does the job for analysing refresh rates at which overshoot can become overbearing. I then focus on these refresh rate ranges during the in-game testing in videos and will discuss them in a similar VRR context in the written reviews. Most of the time 60Hz is a good testing point for overshoot: if it’s quite strong for a particular overdrive setting there, the overshoot tends to be quite visible below ~100Hz or so, usually increasingly so as the refresh rate lowers, until the LFC boundary is crossed and it suddenly switches to a lower level (as the refresh rate suddenly shoots up – see the sketch below). It’s extremely unlikely to be mild to moderate at 60Hz but strong closer to the LFC boundary, though. You can’t be too exact as everyone has a different overshoot tolerance anyway.
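
            As a rough illustration of the LFC behaviour described above (not any vendor’s actual algorithm – the 48-165Hz VRR window is just an example figure), frame repetition below the VRR floor makes the physical refresh rate jump back up, which is why the overshoot character suddenly changes there:

# Rough sketch of Low Framerate Compensation (LFC) behaviour - not any
# vendor's actual algorithm. The 48-165Hz VRR window is an example figure.

def panel_refresh_hz(frame_rate, vrr_min=48, vrr_max=165):
    """Approximate refresh rate the panel physically runs at for a given
    game frame rate, with simple frame repetition below the VRR floor."""
    if frame_rate >= vrr_max:
        return vrr_max              # capped at the maximum refresh rate
    if frame_rate >= vrr_min:
        return frame_rate           # 1:1 inside the VRR window
    # Below the floor each frame is shown multiple times, so the panel
    # suddenly jumps back up to a much higher refresh rate.
    multiplier = 2
    while frame_rate * multiplier < vrr_min:
        multiplier += 1
    return frame_rate * multiplier


for fps in (165, 100, 60, 47, 40, 30):
    print(fps, "fps ->", panel_refresh_hz(fps), "Hz")
# Just below the 48Hz floor (e.g. 47fps) the panel jumps to 94Hz rather than
# continuing to fall - hence the sudden switch in overshoot behaviour around
# the LFC boundary.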

            #66397
            dynastes

              Thank you for this post. Had you not mentioned it recently in relation to reviews of the FI27Q-X, I would never have known that HUB’s testing methodology is “too” stringent. In fact, I considered it the ultimate methodology among YouTubers (which it still might be, given that most monitor reviews on the platform seem sadly misinformed, with color gamut being considered equal to color accuracy and so forth) – the be-all and end-all for this kind of thing, so to speak.

              Have you offered this opinion to Tim? I can’t imagine he would not be open to discussing it. His methodology should be adjusted if it meant a more accurate representation of actual picture quality. Also, he should perhaps add some “real-world” testing, like the one you are doing.

              #66399
              PCM2

                It has been mentioned on multiple forums now, and also on Reddit, with input from Simon Baker of TFT Central as well. So it is certainly out there in the open being discussed. I’m sure Tim will become aware of this discussion in time, but if not I’m certainly open to mentioning it directly. I know he ultimately wants what’s best for his users and wants them to form an informed opinion on a product, so he should certainly be open to adapting things to achieve that goal. 🙂
