Existing guidelines with respect to non-inferiority and equivalence
Sebastien Marque, Head of Biometrics, Danone Research, France
A clinical perspective on setting equivalence limits
Tjoeke Tan, MD. Independent consultant, The Netherlands

When developing new chemical entities, the main objective is generally to demonstrate a clinically significant (positive) effect compared to placebo or, in rare cases, compared to a standard product. When dealing with second entries or generic formulations, essential similarity or equivalence becomes the issue. When the reference product exerts a clinically relevant effect by means of (measurable) plasma concentrations, it may be sufficient to demonstrate that the generic product produces PK profiles similar to those of the reference product. Assuming that similar plasma concentrations have the same clinical effect, the generic product will be declared essentially similar or equivalent to the reference product, and hence interchangeable with it. The plasma concentrations of the generic and reference products are allowed to "differ" from each other within certain limits, which form the equivalence range. This (bio)equivalence range has been set at 0.80 – 1.25 and has been described in a number of Notes for Guidance.
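
As a minimal sketch of how the 0.80 – 1.25 range can be applied, the Python snippet below checks whether an interval for the test/reference ratio lies entirely inside the limits. It assumes that the comparison is made on a confidence interval for the ratio of a PK measure; the function names and the example intervals are hypothetical and serve only as illustration.

```python
import math

# Limits quoted in the text above.
BE_LOWER, BE_UPPER = 0.80, 1.25


def log_limits(lower=BE_LOWER, upper=BE_UPPER):
    """Return the limits on the natural-log scale, where they are symmetric around 0."""
    return math.log(lower), math.log(upper)


def within_equivalence_range(ci_lower, ci_upper, lower=BE_LOWER, upper=BE_UPPER):
    """True if the whole interval for the test/reference ratio lies inside the limits."""
    return lower <= ci_lower and ci_upper <= upper


# Hypothetical intervals for the test/reference ratio (illustration only).
print(within_equivalence_range(0.92, 1.08))  # True: interval inside 0.80 - 1.25
print(within_equivalence_range(0.78, 1.02))  # False: lower bound falls below 0.80
```

The limits look asymmetric on the ratio scale, but since 1.25 = 1/0.80 they are symmetric once logarithms are taken.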

When dealing with products that exert their effect via local application, and for which plasma levels do not matter in producing a (positive) clinical effect, essential similarity cannot be demonstrated as described above. Several Notes for Guidance describe how to deal with this kind of product. From these Notes for Guidance it is clear that clinical studies showing therapeutic equivalence are mandatory. Unfortunately, the Notes for Guidance do not provide clear guidance on how to set therapeutic equivalence ranges, except that the 0.80 – 1.25 rule is not acceptable. A US Guideline provides a "rule of thumb": take one third to a half of the effect size shown in clinical trials between reference product and placebo. Unfortunately, most EU regulatory authorities often consider such an equivalence range too wide.
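
The rule of thumb above amounts to simple arithmetic, sketched here in Python. The function name and the effect size of 12 units are hypothetical; the snippet only shows the range of candidate margins the rule would yield, not a margin that any authority would accept.

```python
def rule_of_thumb_margins(effect_vs_placebo):
    """Return (one third, one half) of the reference-vs-placebo effect size."""
    return effect_vs_placebo / 3, effect_vs_placebo / 2


# Hypothetical example: the reference product beat placebo by 12 units
# on some clinical scale in earlier trials.
narrow, wide = rule_of_thumb_margins(12.0)
print(narrow, wide)  # 4.0 6.0
```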

Setting a therapeutic equivalence range remains difficult because it often rests on the personal opinions of clinicians and because regulatory bodies give no clear guidance. In many cases, an equivalence range is constructed from personal opinions, previous experience, (not very clear) advice from regulatory authorities, and precedent: what others have done in the past and whether regulatory authorities accepted it.


Statistical aspects of equivalence margin specification
Towards a Statistical Guidance for Choosing a Measure of Distance and the Margin(s) in Equivalence/Noninferiority Trials

Most of the considerations apply both to noninferiority trials and to studies aiming to establish equivalence in the strict, i.e. two-sided, sense. It is argued that a careful, statistically meaningful specification of the population characteristics (parameters) selected for measuring the "distance" between the distributions under comparison is at least as important as a well-grounded specification of the margin to be applied to a given distance measure. An issue of key importance in selecting a distance measure is that the choice should take into account the structure of the parameter spaces in which the basic distributional parameters take their values. It is shown that the difference transform fails to satisfy this requirement whenever the parameters under comparison are both unknown and, like binomial proportions, have a bounded range.
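
One way to picture the problem with the difference for bounded parameters is sketched below in Python. For a fixed margin delta, the set of proportions within delta of the reference proportion is cut off by the boundaries 0 and 1 when the reference lies near them, so the same margin means different things at different reference values. The odds ratio is used as a contrast only because its range maps back into the unit interval without truncation; it is an assumption made for this illustration, not a measure named in the abstract, and all function names and numbers are hypothetical.

```python
def difference_range(p_ref, delta):
    """Proportions p with |p - p_ref| <= delta, clipped to [0, 1].

    The third element reports whether clipping was needed.
    """
    lo, hi = p_ref - delta, p_ref + delta
    truncated = lo < 0.0 or hi > 1.0
    return max(lo, 0.0), min(hi, 1.0), truncated


def odds_ratio_range(p_ref, rho):
    """Proportions whose odds ratio against p_ref lies in [1/rho, rho], rho > 1."""
    odds = p_ref / (1.0 - p_ref)
    lo_odds, hi_odds = odds / rho, odds * rho
    return lo_odds / (1.0 + lo_odds), hi_odds / (1.0 + hi_odds)


# Same margin delta = 0.10 at two hypothetical reference proportions.
print(difference_range(0.50, 0.10))  # no clipping needed in the middle of the interval
print(difference_range(0.05, 0.10))  # lower end clipped at 0 near the boundary
print(odds_ratio_range(0.05, 2.0))   # stays strictly inside (0, 1)
```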
The second core part of the talk is devoted to establishing fairly comprehensive guidance for choosing the equivalence margin in a noninferiority or equivalence trial with symmetric specification of the equivalence range. The basic idea is to reduce a comparatively wide range of apparently unrelated settings to the very simple case of comparing just two probabilities, one of which takes its value around the middle of the unit interval. For two parameters of this structure, it is not really difficult to reach a consensus on what separates gross differences from still acceptable ones.

Problems with setting equivalence limits
Without doubt the concept of non-inferiority is sensible. Examples of truly wanting to show equivalence may be less common, although there are some justifiable reasons. The problem seems to revolve around a margin of inferiority and its pre-specification. Different groups (notably different regulatory agencies) seem to take different approaches; undoubtedly not all regulators agree with a single (even regional) position. Not all regulators understand their own regulatory guidance and, without doubt, many people in pharmaceutical companies have either not read the guidance or don't understand it. But is it really sensible to set a margin a priori? We don't do it for superiority. In that situation we test the rather silly null hypothesis of no benefit, but we certainly don't make regulatory or treatment decisions based on rejecting it. We make such decisions based on looking at how big the (positive) effect is and, crucially, how big the negative/adverse effect is. A given size of benefit will only ever be acceptable if it outweighs the risks. So, too, with non-inferiority. Whatever margin we set a priori will be based on assumptions about risks, but ultimately the conclusion from the trial will have to be based on efficacy and safety. So why not just do the trial, estimate the sizes of benefits and harms, then decide how to interpret the results?

But the size of the margin is only a small part of the problem…


