MathsWatch spaced-learning revision programme- Edexcel 1MA0

I’ve put together a revision programme based on MathsWatch for the year 11 cohort of 2015-16 studying Edexcel 1MA0 GCSE maths. It is based on spaced-learning principles to maximise retention. This could be used as an additional revision programme to complement weekly practice papers, which are essential in my opinion. It will ensure students systematically visit all topics on the GCSE 3 separate times during year 11 revision.

Download the D to C revision programme

Download the C to A* revision programme

MathsWatch Year 11 revision programme D to C

1st revision– students watch the MathsWatch full video lesson and make notes on this template.

Also, download an example you can show them of what good notes look like.

MathsWatch revision programme template- example of good notes


2nd & 3rd revision– students watch the MathsWatch ‘1 minute maths’ video and complete the questions in the video.

Little and often! A bit of structure for the well-intentioned students who need it. :-)

*Note to be aware of… Clip 60… We teach Gelosia multiplication at Wyvern College, rather than long multiplication. The pdfs advise students to look for a clip elsewhere on this topic.

Numeracy Ninjas



Today is a rather emotional day for me. The last few years have been an amazing journey learning about effective maths teaching. Steps along the way have included: The problem with levels- gaps in basic numeracy skills identified by rigorous diagnostic testing; Forgetting is necessary for learning, desirable difficulties and the need to dissociate learning and performance; Going SOLO on the journey towards deep learning; How do we make John Hattie’s “Visible Learning” work in maths?; and, of course, You’ve never seen the GCSE Maths curriculum like this before…

Today is a milestone where I’m launching Numeracy Ninjas, a free KS3 numeracy intervention that puts all of my CPD learning from the last 3 years into practice! It incorporates all of the curriculum mapping work with a schedule that maximises retention over time. Best of all, I’m making it completely free for any school to use!

Ninja Character GOLD

Visit to meet the Numeracy Ninjas!

The problem with levels- gaps in basic numeracy skills identified by rigorous diagnostic testing

I have felt for a long time that one of the disadvantages of the current levelling system is that it encourages teachers to constantly teach students mathematical concepts and ideas only at levels equal to or above that which they have recently scored on an assessment. The data-focused, level-centric system only rewards teachers if their students can score marks on higher-level content; there is no explicit incentive for filling gaps in students’ knowledge at lower levels.

Good maths teachers know the importance of their students having a strong knowledge in the foundations of the subject and the long-term benefit of plugging any gaps students have. This year I wanted to be much more systematic in identifying any key gaps students had in their knowledge of the foundations of secondary maths as soon as possible when they arrived in year 7.

What we did

We diagnostically tested every year 7 student in three areas: mental numeracy calculation strategies, timestables and key nodes. For each student, we assessed 30 mental calculation strategies, such as number bonds, reversing an addition sum to make it easier, and counting up from the smaller number to the larger in subtraction. We assessed 30 of the timestables. We also assessed the top 30 key node topics identified from my previous work on understanding the most important topics for students to master prior to studying GCSE maths. In total we collected 90 data points for each of 203 year 7 students shortly after they arrived with us in September 2014.

Whole network no labels low res

The GCSE Mathematics curriculum showing the prior-knowledge links between topics. The most connected topics are the ‘key nodes’.

The testing was carried out in a single 50 minute lesson using the QuickKey app available in the App Store. Students were shown a PowerPoint presentation that contained the 90 questions, each with a specified time limit. Teachers clicked ‘start’ on the presentation and students had to identify the correct answer for each question from five multiple choice answers. We ensured ‘distractor’ answers, based on common misconceptions, were placed alongside the correct answer in each question. Students each had a grid on a piece of A3 paper in which they had to colour in the circle corresponding to their chosen answer for each question. After the assessment the test papers were scanned using the QuickKey app on my iPhone, which automatically marked them and recorded the whole cohort’s results in a single spreadsheet file. The scanning took just 3 hours and we obtained over 18,000 data points for the whole cohort from just a 50 minute lesson!

Example of a QuickKey ticket that the students filled in to answer the questions. These were scanned using an iPhone running the QuickKey app that marked the tests and put the results into a spreadsheet


We tested the following number of students in each KS2 sub-level group:

3.0- 4, 3.3- 5, 3.7- 9, 4.0- 23, 4.3- 28, 4.7- 32, 5.0- 30, 5.3- 35, 5.7- 16, 6.0- 21
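As a quick sanity check on the figures quoted above, the short sketch below adds up the group sizes and multiplies by the 90 questions each student answered. The variable names are illustrative only; the numbers are the ones listed in this post.

```python
# Number of students tested in each KS2 sub-level group (from the list above).
students_per_sublevel = {
    "3.0": 4, "3.3": 5, "3.7": 9, "4.0": 23, "4.3": 28,
    "4.7": 32, "5.0": 30, "5.3": 35, "5.7": 16, "6.0": 21,
}

# Three assessed areas (mental strategies, timestables, key nodes), 30 questions each.
QUESTIONS_PER_STUDENT = 3 * 30

total_students = sum(students_per_sublevel.values())
total_data_points = total_students * QUESTIONS_PER_STUDENT

print(total_students)     # 203
print(total_data_points)  # 18270
```

The total agrees with the "over 18,000 data points" collected in the single lesson.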

The results

This graph shows the performance of different KS2 sub-level groups in the three main areas assessed.


The Pearson correlation coefficients for the mental strategies, timestables and key nodes assessments versus KS2 level were 0.72, 0.58 and 0.71 respectively. These correlations were also checked against a KS3 SAT assessment sat by all students during the first half term with us and found to be almost identical.

The weakness in the mental calculation strategies of level 3 students is clear to see. Level 4 students were far from being secure in many of the basic mental calculation strategies we would take for granted that they would know. For example, some did not automatically reverse an addition to make it easier, could not calculate a number bond to 100 or tell the time.

It is important to state that the timing of the questions was reasonably swift during the mental calculation strategies and timestables sections of the assessment, as we wanted to assess what students could do fluently through recall and fast strategies, rather than what they could do with written calculations if they had a lot of time. During the key nodes assessment we gave students a bit more time and allowed them to use written calculations, but again set the timing such that if they did not show good fluency in choosing and executing the correct strategy they would not have had time to answer the question. So when I claim that some students could not do number bonds to 100, I am saying that they could not do this mentally within approximately 10 seconds.

The thinking behind this assessment approach was that, in order for these skills not to become barriers to learning and working-memory-consuming difficulties when studying more challenging topics on the GCSE course, we want to assess whether these skills are fluent and can be executed quickly, almost without thinking.
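For anyone wanting to run the same kind of check on their own cohort, here is a minimal sketch of how a Pearson correlation coefficient can be calculated. The function name `pearson_r` and the sample data are hypothetical and are not the real results from this trial; a spreadsheet's built-in correlation function would do the same job.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the means (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalised variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # Note: this divides by zero if either list has no variation at all.
    return cov / sqrt(var_x * var_y)

# Hypothetical data, NOT the real cohort results:
# each student's KS2 sub-level and their score out of 30 on one assessment.
ks2_levels = [3.0, 3.7, 4.3, 4.7, 5.0, 5.3, 6.0]
scores_out_of_30 = [8, 14, 15, 22, 21, 26, 29]

r = pearson_r(ks2_levels, scores_out_of_30)
print(round(r, 2))
```

A value close to 1 means higher KS2 levels go with higher scores; a value close to 0 means there is little linear relationship.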

The results of the timestables assessment were better than I expected, particularly for the lower attaining KS2 students. However, when looking into the data it was quite apparent that many of the students who knew their timestables perhaps did not understand the concepts behind them. For example, they got the questions on understanding multiplication by its link to repeated addition wrong.

To delve a little deeper, the following diagrams show the mean average proportion of students within each KS2 sub-level group that got each of the 90 questions correct.

Mental numeracy calculation strategies

Mental numeracy strategies




Key nodes

Key nodes

There are many interesting interpretations and observations to be made from these results. I will leave it to you to delve into this as deeply as you wish. I think the diagrams summarise nicely the differences between what level 3, 4, 5 and 6 KS2 students can do fluently across the topic of number. Broadly speaking: level 3 students have mastery of none of the three areas; level 4 students have some proficiency with mental calculation strategies and timestables, but not the key nodes; level 5 students have reasonably secure mental calculation strategies and timestables and have some proficiency with the key nodes; and level 6 students are broadly secure in mental calculation strategies and timestables and can do many of the key node topics already. It is an obvious statement, but with the strong correlation coefficients already cited, it would appear that what a student can do in number is a strong indicator of what they can do across the curriculum.

These results add further support to my belief about levels hiding gaps in the foundation knowledge of some students. A significant proportion of level 4 students struggled to fluently identify number bonds to 100, a level 3 skill. Even many of the level 5 and 6 students did not correctly answer the questions on understanding multiplication as repeated addition or division as the inverse of multiplication. It goes to show- giving a ‘level 6 learning objectives’ sheet to a ‘level 5’ student is not good enough. Perhaps they can already do some level 6 topics, but may have gaps at levels 3 and 4. It must be personalised on an individual student basis.

Has growing up in a levels-based system, where teachers and students are only rewarded for achieving success on content at higher levels in the subject, resulted in oversight/ignorance of the gaps in the students’ foundation knowledge at lower levels? How much easier would learning the more advanced topics be for students if they had comprehensive fluency with these basic skills? The hidden gears need oiling once in a while.

To be clear, this is in no way a reflection on our excellent primary colleagues. They do a brilliant job. They are constrained by a single-minded, level-incentivised, high-stakes system, just like we are in secondary, and they act accordingly to meet the external pressures placed on them. We do the same in Year 11.

A change in mindset is required. If levels are going (and they are!) we must not replace them with a system that has the same flaws. I am certainly not suggesting that we shouldn’t teach students higher level content than they can currently attain; but this must not be just a single-minded focus either. From the diagnostic testing we did this year I have learned that KS3 lessons across the ability spectrum still require systematic, planned, regular practice in building students’ fluency in the foundation topics of number. Fluency (speed and accuracy) is a fitness, it is not binary. Even if you are 100% accurate, you can always be faster than you are at present. If two students can calculate number bonds to 100, but one of them takes five times as long to do it as the other, the slower student’s learning of higher-level concepts will be all the more difficult for them later in their maths education. We must not fall into the trap of confusing instantaneous performance with retention and transfer – learning. I have written about this extensively before. Learning must not be seen as a checklist of visit-once objectives. Even the highest attaining students need to occasionally revisit some of these elementary topics for which their “fluency fitness” has fallen. A green cell in a spreadsheet indicating that they can do a skill today should not be taken as a proxy that they will be equally fluent in this skill in six months’ time.

In our KS3 lessons at Wyvern College from September 2015 we will ensure that not only do we strive to raise students’ proficiency with higher-level concepts, but we will also provide short daily exercises that build students’ speed and accuracy in all three areas assessed in our diagnostic trial. It is with great excitement and anticipation that I am going to launch a five-minute-every-lesson, fluency-building product on Great Maths Teaching Ideas this summer. It will provide systematic, rigorous coverage of every topic assessed during our diagnostic trial. Watch out for that! As a consequence of what we learned from the diagnostic testing of our year 7 arrivals this year, I feel so passionately about the need for this product, that it will be made available free for all schools via this website in the next month or two.

Keep an eye out!

Update: the Powerpoint we used can be downloaded here.

Time-series jigsaw puzzle fun!


Teaching time-series graphs?

Get students to work in pairs. One solves a jigsaw puzzle. The other student records how many pieces in the jigsaw were solved every n seconds. Choose n appropriately.

Students then plot graphs of number of pieces solved vs time. Then ask them:

  1. Describe the shape of your graph. Why did it take this shape?
  2. What would the shape of the graph be for someone who is very good/ very bad at solving jigsaws?
  3. Predict what would happen to the shape and position of the graph if you had to solve a larger/ smaller jigsaw puzzle. Test out your prediction!
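If students record their tallies electronically, the sketch below shows one possible way to turn them into points for the graph. It assumes the recorder notes how many new pieces were placed in each n-second interval (rather than a running total); the function name `cumulative_series` and the sample tallies are made up for illustration.

```python
from itertools import accumulate

def cumulative_series(pieces_per_interval, n_seconds):
    """Return (elapsed_seconds, total_pieces_solved) points, starting at (0, 0)."""
    # Running total of pieces solved after each interval.
    totals = [0] + list(accumulate(pieces_per_interval))
    # Elapsed time at the end of each interval.
    times = [i * n_seconds for i in range(len(totals))]
    return list(zip(times, totals))

# Hypothetical tallies for a 24-piece jigsaw, recorded every 30 seconds.
tallies = [1, 2, 2, 4, 6, 9]

# Print a rough text version of the time-series graph.
for t, total in cumulative_series(tallies, 30):
    print(f"{t:>4} s  {'#' * total} ({total})")
```

The points can be plotted by hand or pasted into a spreadsheet to draw the graph.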

AQA 90 problems to solve- perfect for 9-1 Maths GCSE


AQA have recently released their ’90 problems’ to support teachers in delivering the problem-solving element of the new 9-1 maths GCSE. It’s a rebranding of the 90 problems they released to support the teaching of Further Maths GCSE in 2008. There are a number of issues with many of the diagrams on the newer version, so I’ve attached the 2008 version for your interest, here: AQA 90 Problem solving questions, Answers

They are excellent resources and perfect for some little-and-often problem solving practice. Highly recommended.


Geometric squares


Look at the shapes in this square grid. The hexagons on each row, column and diagonal are made of the three shapes in that row, column or diagonal respectively. This completed ‘Geometric Square’ was posted on Twitter by @PardoeMary.

Can your students create another geometric square? Can they create tessellation patterns of the results? What if you didn’t use hexagons, but some other starting shape?

Click here to download a blank template: Geometric squares

Join me at the English and Maths 2015 Conference #oeEM15


Optimus Education have launched a new conference: English and Maths 2015- Effective Teaching Strategies to Meet New Accountabilities.

I’m delighted to have been asked to speak at the event on 22nd October and will run a workshop session on: designing a maths curriculum that ensures learning is conceptually deep, retainable and context-transferable; maximising depth of learning and procedural fluency.

Other speakers include Matt Parker (of Numberphile and Standup Maths fame), David Didau (influential blogger) and Danielle Bartram (author of the well-known @MissBsResources).

Download a pdf brochure for the conference here.

It would be great to see you there! Book early for up to a £60 discount.

Follow @JamesAllen_OE and @OptimusEd for more info.

Powerful, diagnostic, games-based AFL- Kahoot!!!


Kahoot is a tremendously useful, free AFL tool I recently came across after a Twitter recommendation. Students can use any web-enabled device (any OS platform) to take part in games-based quizzes. There are thousands of quizzes publicly available, or you can create your own in a simple drag-and-drop editing tool and add to the public pool. There are many maths-based quizzes already in the public pool suitable for KS3 and GCSE maths classes.

Students can access the quiz without needing logins and passwords, making it easy to use in class. They simply go to in their web-browser and type in a game-pin number. The game begins and students answer multiple-choice questions, scoring points for speedy correct answers. The leaderboard is updated after each question to build the tension!

After the quiz you can download a colour-coded spreadsheet of all your students’ responses. The students love it- I love the AFL!

See the video below of Kahoot in action to get a feel for it. To register for an account, go to

Kahoot in action

Algebra Tiles- from counting to completing the square

I have become increasingly interested in visual models recently as a way of introducing topics. Visual models have the power to illustrate concepts in their rawest, simplest form without the misleading associations that words and abstract notation can introduce. I’m convinced concrete/visual model introductions should form an increased part of my practice, but the question that interests me is which visual models should I use? There are obviously many considerations, but one is how comprehensively they cover the syllabus. A visual model that demonstrates expanding brackets particularly well would not be that useful if it could not also demonstrate the concept of factorising. Through much reading I believe I have found two visual models that cover the vast majority of number and algebra topics. They are almost mutually exclusive too, complementing each other by covering different, rather than overlapping parts of the curriculum.

The first visual model is bar modelling. The more I experiment with it in lessons, the more I am convinced it can open the door to so many topics that many mid-to-low attaining learners previously found inaccessible. I wrote recently (click here to view) about all the many different topics bar modelling can be used for- from basic fractions work, through FDP, ratio and up to reverse percentages and compound interest.

In contrast to bar modelling, I believe algebra tiles is a very powerful concrete/visual modelling technique that can be used to develop conceptual understanding of topics. In this post I will explain how the algebra tiles model works and demonstrate how it can be used to introduce a great many topics that bar models are not suitable for.

Before I go any further I want to make it clear that I do not think all students need to experience visual models when topics are introduced. Teachers should use them selectively when they think they are needed and will support students’ learning. Visual models often fall down for particular variations of question and they are not meant to replace abstract reasoning, merely be a bridge to it for students that need it.

Basic rules of algebra tiles

It is anticipated that you will make these tiles as physical resources for students to use.




Four operations using algebra tiles including with negative numbers











Types of number




All of the above concepts are relatively trivial and you may have used variations on the algebra tile model when teaching them. For example, using multi-link cubes to derive factors or show the square numbers. However, what is powerful about the algebra tiles model is how seamlessly it transfers to algebraic concepts…

Collecting like terms




Solving equations





Expanding brackets










Completing the square


Like bar modelling, I like the way that algebra tile models cover so much of the syllabus. Whilst bar models cover mainly FDP and ratio, algebra tile models seem to cover the other topics in number and algebra. Rather than competing with bar modelling, it seems to complement it.

I’ve not used algebra tiles in my lessons yet and will do some experimenting in the coming months. I particularly want to focus on whether their use enhances students’ understanding of concepts- whether they see the links between the visual and the abstract.

At present I’ve not found many resources out there to scaffold and support this approach. If you know of any please do let me know so I’m not reinventing the wheel! If you have a go with algebra tiles (or bar modelling) yourself let me know how you get on and what you learn as you go along.

Happy teaching.

Building interleaving and spaced practice into our pedagogy

This post builds on the conclusions I reached in the post Forgetting is necessary for learning, desirable difficulties and the need to dissociate learning from performance. Make sure you’re up to speed with that post before reading on.

In my department we have historically done blocked practice from year 7 to mid-year 11 and then switched to interleaving from Christmas in year 11 onwards in the form of weekly past paper homeworks. According to Bjork and Rohrer’s research we should see greater sustainability in learning outcomes from the interleaving part of our practice. I thought I’d dig into our own assessment tracking data to see if it agrees with their claims- it does!

The following graph shows the mean assessment level (in terms of progress to go to meet Minimum Expected Progress) of our 2013-14 cohort as they progressed from starting GCSE maths in year 9 to their final GCSE grade:

KS4 whole cohort mean progress tracker

It would appear that the story in our assessment data agrees well with Bjork’s and Rohrer’s findings on interleaving and spacing. This cohort went on to achieve excellent progress and attainment when compared with national averages (80% A*-C), but it was only when they started the interleaved practice sets of questions that the retention and transferability of their learning began to build substantially. Interleaving is effective because it gives students experience in selecting a strategy to solve a problem as well as executing the strategy. In blocked practice we give students the strategy and they don’t gain experience in selecting it.

I plan to bring interleaving and spacing forward in the department’s practice right to day one in year 7 in our new SOWs. There will still be blocked practice during the early stages of learning, but students will get interleaved questions at regular intervals right through the five years. Currently, I’m thinking of this being in the format of open-book end-of-unit activities that recap the content of the whole topic, then another to recap topics taught cumulatively to that point. Bjork says low-stakes/high-frequency is important for these to be successful, hence these being open-book activities. Spacing is inherent in interleaved practice, but I am also planning for starters to be used for retrieval events of previous material.

There still remains the issue, as discussed in the previous post, that if we are doing this correctly, in-lesson performance of students will be lower. As Rohrer showed, an interleaved-taught unit showed significantly less in-lesson progress, but ultimately three times the long-term retention and transfer of student learning. Managing staff/student expectations isn’t going to be easy.

Food for thought.

Detail behind the data:

Students sat full GCSE exam papers during the assessments. The grade boundaries on all assessments up to the exams were higher than those of the final GCSE exam turned out to be (35 for a C) by approx 0.3 levels. This would have the effect of reducing the interleaving gains to 1.0 levels over the ⅔ of a year- still considerable and a step change from the rate of progress during blocked practice.

Forgetting is necessary for learning, desirable difficulties and the need to dissociate learning and performance

How many questions should I give students to work on after my instruction? Should I group all the questions together or space them over time? Should I ‘block’ questions on the same topic together or should I mix them with questions from other topic areas? Is ‘over-learning’ an efficient strategy for boosting student outcomes? Should I always be using high-frequency formative assessment techniques to guide my instruction? Is it possible, and if so, how do you measure the learning that has happened in a lesson? What does best practice look like in an assessing-without-levels world?  Within a progress context, are rapid and sustained mutually inclusive or exclusive?

We live in times of enforced, but relatively unguided change where schools are asking themselves questions about the fundamentals of pedagogy, learning and assessment. As I work on the evolution of my own department’s schemes of work (and the pedagogy I want these to promote) the above questions and more have been at the forefront of my thinking.

The beauty and intellectual intrigue of trying to understand learning stems from many sources: the difficulty of defining it; the complexity and often non-intuitive strategies in creating conditions that nurture it; and the impossible, yet relentless focus on trying to evaluate and optimise it quantitatively. Every teacher has their view and I’ve often found these differ more broadly in experienced colleagues than in those new to the profession. This is not a criticism, quite the opposite- it is the result of reflective thought after sufficient time and experience realising that ‘the fundamentals’ they learned during their training rest on boggy ground. In my own training, AFL was the non-negotiable, silver bullet to effective learning in the classroom. Now I’ve had a few years working with AFL, I’ve experienced how it can be a double-edged sword if the subtleties are not appreciated. I’ve seen many teachers (myself included in the early years) and government initiatives mistake students’ instantaneous performance for learning, through misunderstanding what AFL’s limitations are. Debate is healthy and useful, but the plural of anecdote is not data. Many middle and senior leaders are currently, in part because they have been challenged to do so by government policy, searching for self-evident teaching truths on which to rebuild their systems and pedagogy.


The profession has duly looked to the academic education research community for inspiration and authority with which to distil effective practice from the vast, turbulent, cyclical ocean of fashionable ideas and possibilities. The emergence of Tom Bennett’s ResearchEd community is a natural consequence of numerous teachers simultaneously dipping their toes into educational research findings and wanting to collaborate. I am one of those teachers and I write to share with you some significant CPD I have undertaken over the last year to try to gain insight into potential answers to the questions at the start of this article.

I became interested in the role of memory in maths education following a visit to King Solomon Academy where I met Kris Boulton. I learned about how this school, and its maths department under the leadership of Bruno Reddy, had designed a maths education curriculum that has subsequently resulted in their first cohort achieving 95% A*-C. The pedagogy they developed within their department was based on numerous academic sources relating to cognitive science. One such source was the work of Robert Bjork, Distinguished Professor of Psychology at the University of California, Los Angeles. I have read his work myself over the last year and am at a point where I am beginning to be able to apply it in my own, and my department’s practice.

Bjork is known for his framework that conceptualises learning within the context of memorisation called The New Theory of Disuse. The work builds on research by Thorndike in the early twentieth century and the observation that learned information fades away over time. Bjork supersedes Thorndike’s Theory of Disuse with his new theory because of research that showed memories do not disappear completely, they instead only become inaccessible over time. Bjork’s New Theory of Disuse puts forward the notion that anything learned (a memory representation) can be thought of as having strengths based on two indices: storage strength and retrieval strength.

Storage strength– reflects how inter-associated a given representation is with other memory representations. It is the depth of learning. Once accumulated, storage strength is never lost- the information remains in memory as evidenced by recognition, priming and, especially, relearning.

Retrieval strength– current ease of access. It is how primed or active an item’s representation is as a consequence of recency or current cues. Information in memory, no matter how over-learned, becomes inaccessible with a long enough period of disuse. Retrieval strength falls over time. Recall of the memory representation builds retrieval strength.

Put simply, when we learn something, the depth of understanding to which we have learned it will never recede. Deep learning stays deep. However, unless we regularly recall it, the learning will become more inaccessible as time goes on.

You could therefore plot any memory representation on a two-dimensional plane of storage vs retrieval strength. Examples of typical representations in the four quadrants of the plane would be:

Low-storage/ high-retrieval– What you had for lunch yesterday. You remember it because it is recent, but you’ll soon forget it because you haven’t linked it to other memory representations- it wasn’t that important to you.

Low-storage/ low-retrieval– What you had for lunch this day eight months ago. Same as above but now with low retrieval strength because time has elapsed. The memory has become almost inaccessible.

High-storage/ high-retrieval– The birthday dates of your children. They mean a lot to you (are connected to many other memories) and you recall them regularly.

High-storage/ low-retrieval– The names of people in your Year 1 primary school class. You have forgotten them because you haven’t recalled them recently, but shown a list you could pick them out (storage strength hasn’t been lost and retrieval strength can be quickly rebuilt with recall).

We obviously want students to have memory representations of their learned material in the high-storage/ high-retrieval quadrant. The argument for a mastery, rather than KS3/4 spiral-based curriculum fits within this framework because of the time it creates to do deep learning. Teach-once-deep (but regularly recall) rather than teach-twice-shallow (and not have time to do much recall) makes sense if storage strength is only ever cumulative. Single five-year curricula are becoming more popular, and Bruno Reddy was amongst the first well-known figures in the UK to adopt the format.

So far, The New Theory of Disuse has been discussed within a memory representation context. The next step is to consider the implications it has for understanding what real ‘learning and progress’ look like in a maths classroom.

Corresponding to storage vs retrieval strength, there is a time-honoured distinction in academic learning research stretching back to the early twentieth century between learning vs performance.

Performance (retrieval strength)– what we can see, observe and measure at the current time in the maths classroom. It’s what I can see in students’ books or on their mini-whiteboards when I’m asking them a question similar to what I’ve just taught them to answer.

Learning (storage strength)– what I have to try to infer rather than what I can measure. The question of whether learning has happened is: “have those relatively permanent changes happened that will support my performance in the long-term?” Learning judgements are focussed on both retention and transfer (to different applications and contexts).

There is a severe danger that current retrieval strength (performance) can be interpreted as storage strength (learning). Bjork’s and others’ work shows that current performance is often a very poor indicator of whether learning has happened. The dissociation between the two can result from things such as predictability or current cues that are there now but won’t be later. These can prop up performance and give the impression rapid learning has happened when it hasn’t. In a 2014 talk at Harvard University, Bjork cites relatively old research that has shown there can be considerable learning in the absence of performance. In more recent research they have shown the converse to be true- you can have considerable increases in performance with virtually no resulting learning.

It is my belief that many experienced teachers understand the difference between performance and learning, and its importance. I would, however, like to raise the question of whether some contemporary systems and common practices fail to dissociate the two. These include: lesson observation, work sampling, AFL (if the limitations are not understood) and, potentially, assessment without levels. However, before I elaborate further on this point, it is important to understand in more detail the interaction between retrieval and storage strength (they are interdependent variables), and also the research that has observed this interaction in real-world classrooms.

Ebbinghaus was the first to publish on how forgetting helps learning: frequent recall builds retrieval strength (and storage strength, although Ebbinghaus didn’t distinguish between the two), which then slows the rate of forgetting. Subsequent research has shown in more detail that storage strength and retrieval strength are interrelated. As you recall a memory representation, both the retrieval and the storage strengths increase. The degree to which they each increase depends on their relative strengths at the time of recall. Increments in retrieval strength are a decreasing function of the item’s current retrieval strength, but an increasing function of the item’s current storage strength. The deeper you have learned something previously, the faster you ‘relearn’ it. Conversely, the higher the current retrieval strength of a memory representation, the smaller the increments in storage strength (i.e. learning). Forgetting becomes necessary to reach a new level of learning. Something that is completely accessible (high retrieval strength) is completely unlearnable (its storage strength cannot be raised) in the sense of getting to another level of learning above that reached already. In other words, if something is memorised by rote alone, through repetition that is too frequent, the ‘learning’ gains (storage strength increments) rapidly decrease in size. Therefore, conditions that reduce retrieval strength and build storage strength can enhance learning. Bjork refers to these as desirable difficulties because they prevent retrieval strength growing too quickly, which would reduce learning gains. In summary, because forgetting enables learning, conditions of instruction that appear to create difficulties for the learner, slowing the rate of apparent learning, often optimise long-term retention and transfer, whereas conditions of instruction that make performance improve rapidly often fail to support long-term retention and transfer.
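The relationships described above can be played with in a toy numerical model. This is only a sketch: the update rules, the 0 to 1 strength scale and every constant (rate, decay, starting strengths) are my own illustrative assumptions, not Bjork’s published equations. It simply encodes the three qualitative claims: storage gains shrink as retrieval strength rises, retrieval gains grow with storage strength, and storage strength never decreases.

```python
def recall_event(storage, retrieval, rate=0.5):
    """Return (storage, retrieval) after one successful recall.

    Both strengths sit on a hypothetical 0..1 scale.
    """
    # Assumed rule: the higher the current retrieval strength,
    # the smaller the increment in storage strength.
    storage_gain = rate * (1.0 - retrieval) * (1.0 - storage)
    # Assumed rule: retrieval gains fall with current retrieval strength
    # but rise with current storage strength.
    retrieval_gain = rate * (1.0 - retrieval) * (0.5 + 0.5 * storage)
    return storage + storage_gain, retrieval + retrieval_gain


def forget(storage, retrieval, days, decay=0.1):
    """Retrieval strength fades with time, more slowly when storage is high.

    Storage strength is left untouched (it is only ever cumulative).
    """
    daily_keep = 1.0 - decay * (1.0 - storage)
    return retrieval * daily_keep ** days


def practise(recalls, gap_days, storage=0.1, retrieval=0.1):
    """Run a number of recall events separated by gap_days of forgetting."""
    for i in range(recalls):
        if i > 0:
            retrieval = forget(storage, retrieval, gap_days)
        storage, retrieval = recall_event(storage, retrieval)
    return storage, retrieval


massed = practise(recalls=5, gap_days=0)   # all five recalls in one sitting
spaced = practise(recalls=5, gap_days=14)  # same five recalls, a fortnight apart
print("massed (storage, retrieval):", massed)
print("spaced (storage, retrieval):", spaced)
```

Under these assumed rules the massed learner finishes practice with the higher retrieval strength (performance), while the spaced learner finishes with the higher storage strength (learning). The exact numbers mean nothing; only the direction of the difference illustrates the argument.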

Hermann Ebbinghaus (Photo credit: Wikipedia)

One such desirable difficulty is the Spacing Effect. Given a constant number of questions that you ask students to complete, they will have better long-term recall if you space the practice with time intervals (in the order of days) in between, rather than using massed practice, where they would do all of the questions during one session. Massed practice is advantageous if you’re measuring performance (short-term) rather than long-term learning. If you use massed practice, your students will appear to be learning rapidly in comparison with spaced practice, where performance (retrieval strength) will be lower. However, if you use spaced practice, storage strength (learning) grows faster than with massed practice. Bjork cites research by Douglas Rohrer, Professor of Psychology at the University of South Florida, that has demonstrated the long-term learning benefits of spacing over massing specifically within the context of maths education.

In a 2007 paper, The shuffling of mathematics problems improves learning, Rohrer & Taylor published results of two experiments, one of which looked at the performance of undergraduate (non-maths specialist) students who were subjected to lessons on a maths topic unfamiliar to them. Half of the students (spacers) did spaced practice (over two weeks) whilst the others (massers) did mass practice of the same number of questions as the spacers, but in a single session. The students all sat an assessment one week after their last practice session. The spacers outperformed the massers scoring an accuracy of 74% vs 49% on the assessment.

Rohrer goes further in his research to attempt to understand whether there is an optimum time interval between spaced practice sessions to maximise long-term recall. In a 2008 paper, Spacing effects in learning- a temporal ridgeline of optimal retention, Rohrer and colleagues publish results from experiments that have been synthesised into mathematical functions giving recall success in terms of both the study gap and the test delay. The two variables are interdependent, i.e. there is no single study gap that produces optimal recall; it varies according to the test delay. However, certain generalisations can be made. Spacing over single-day intervals is too short: retrieval strength gets boosted too high, too fast, and storage strength growth is quickly limited. If the test is a reasonable length of time in the future, say 200-300 days, the optimum spacing is approximately 20 days. Over shorter test delays, say 70 days, the optimum spacing is approximately 10 days. Fortnightly recall seems a practical conclusion for practitioners operating in the real world to adopt.
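To make the fortnightly conclusion concrete, here is a minimal sketch of how recall dates for a topic could be generated. The function name, its parameters and the example dates are hypothetical; the 14-day default is just the fortnightly rule of thumb from the paragraph above, not the fitted function from the paper.

```python
from datetime import date, timedelta


def recall_dates(first_taught, test_date, gap_days=14):
    """List the dates on which a topic should be recalled.

    Recall happens every gap_days after the topic was first taught,
    stopping before the test date. gap_days=14 is an assumed default.
    """
    gap = timedelta(days=gap_days)
    dates = []
    next_recall = first_taught + gap
    while next_recall < test_date:
        dates.append(next_recall)
        next_recall += gap
    return dates


# Hypothetical example: a topic taught on 4 January with a test on 1 March.
for d in recall_dates(date(2016, 1, 4), date(2016, 3, 1)):
    print(d.isoformat())
```

For a test that is many months away, the same function can be called with gap_days=20, in line with the longer optimum gap mentioned above.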


Rohrer also tested the strategy of over-learning to see if it benefits long-term retention and transfer. Over-learning is defined as giving students significantly more practice questions to complete than other students. In short, whilst there were short-term performance gains, long-term learning was no better. That is, if students do 5 questions correctly they will retain their learning just as well as if they do 30 similar questions. Over-learning is an ineffective use of time.

The implications for lesson design and curriculum planning are clear, but I will hold back from elaborating on them until I have discussed other desirable difficulties in due course.

However, before we go any further, it would now be appropriate to discuss the omnipresent mantra of rapid and sustained progress within the context of the aforementioned research findings. Quite simply, rapid and sustained progress is an oxymoron! If you raise performance (retrieval strength) rapidly, you sacrifice possible learning (storage strength) and sustainability. If you perform highly on a topic too quickly, the automaticity you attain limits the possible long-term learning (storage strength) gains. The storage strength increment is a decreasing function of the current retrieval strength. Lessons that show the most progress (higher performance) result in sub-optimal retention and transferability (learning). To maximise long-term learning we need to limit performance (retrieval strength gains) in order to optimise storage strength gains. Lessons need to be of a high-challenge nature that prevents automaticity forming too early. This builds storage strength, which then ensures subsequent recall events will see large gains in retrieval strength. Do grade 1 practitioners, whose classes show the most progress in lessons, always get the best student outcomes when it comes to exam time? Do you know ‘solid grade 2’ practitioners whose classes do just as well, if not better, in exams? I’ve been aware of this generically for a while now, but the realisation that retrieval and storage gains are negatively correlated does provide more clarity: the ‘grade 2’ teachers who don’t strive for rapid in-lesson progress have students with more sustained learning. There are no quick fixes to long-term learning, surprise, surprise. In fact, it’s worse than that: the findings of this research show that shortcuts are actually destructive to learning. Buy rapid performance today, pay with sustainability in 12 months’ time! The implications for lesson observations are considerable.

Ofsted no longer grade individual lessons. There is a movement within many schools at present to stop grading lesson observations. This seems logical given Bjork’s message that performance needs to be dissociated from learning. Using one to infer the other has been shown to be unreliable. By observing the performance gains of students in lessons, can we infer reliably the size of the learning gains? Can the observer accurately predict what the students are going to retain and be able to apply to a different context six months after the lesson? Having considered this for months, I cannot think of any way in which learning in any observed lesson can be reliably and accurately measured if we, as the research advises us, dissociate it from performance.

Ending the grading of lessons would potentially have an advantage: non-judgemental debrief conversations could focus on strategies for maximising learning rather than performance, the latter of which is the common focus in a good-to-outstanding-led lesson grading culture. In a no-grading culture, the conversations could instead centre on maximising learning (long-term retention and transfer), rather than on ways to rapidly (and to the detriment of learning) raise performance over short time intervals. “What are you going to do going forwards to ensure that what was covered in today’s lesson becomes learned, i.e. there is long-term retention and transfer of the material?” At the very least, if progress is going to be graded, a ‘progress over time’ judgement is much more desirable than a ‘progress in a lesson’ grading.

Back to Bjork’s desirable difficulties. Using high-frequency, low-stakes testing specifically as a learning event (even without feedback) has been shown to have significant positive effects on storage strength gains. The low-stakes part is critical to the effectiveness of this desirable difficulty. Study-test-test-test is more effective than study-study-study-test.

Another highly effective desirable difficulty that increases storage strength gains is contextual interference, one example of which is interleaving. Most practice sets of questions that we get students to work on during study are blocked into topics. We teach them a strategy and they answer a series of questions, all of which require the same strategy. Interleaving is when, instead of blocking, you give students a mixture of questions on different topics (that include and precede today’s lesson). Research findings into the effects of interleaving in maths education say that if my students are learning lots of things, I will maximise long-term retention and transfer (learning) if I arrange the instruction to maximise the possible interference between them, i.e. don’t do blocking. This is counter-intuitive but well-researched. Rohrer believes that one reason interleaving is effective is that it gives students experience in selecting a strategy for answering questions. In blocked sets of questions, the same strategy is repeatedly used and so students only gain experience in executing the already modelled strategy. Transfer in learning requires students to select strategies, before executing them. If all they ever do is blocked activities they never experience the need to select strategies until assessment time.
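As a rough illustration of the difference between blocked and interleaved question sets, the sketch below takes existing blocked sets and reorders them so that consecutive questions come from different topics where possible. The function name and the example topics are hypothetical, and round-robin ordering is only one simple way of mixing; a random shuffle would be another.

```python
from itertools import zip_longest

_MISSING = object()  # marks the gaps left when one topic runs out of questions


def interleave(blocks):
    """Turn blocked question sets into one mixed-topic set.

    blocks maps a topic name to its list of questions. The result is a
    list of (topic, question) pairs taken from each topic in turn.
    """
    labelled = [
        [(topic, question) for question in questions]
        for topic, questions in blocks.items()
    ]
    mixed = []
    for round_of_questions in zip_longest(*labelled, fillvalue=_MISSING):
        mixed.extend(q for q in round_of_questions if q is not _MISSING)
    return mixed


# Hypothetical blocked sets: today's topic plus two earlier ones.
blocked = {
    "area of a circle": ["Q1", "Q2", "Q3"],
    "Pythagoras": ["Q4", "Q5"],
    "expanding brackets": ["Q6", "Q7", "Q8"],
}
for topic, question in interleave(blocked):
    print(question, "-", topic)
```

The questions themselves are unchanged; only their order differs, so students have to select a strategy for each question before executing it.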

In a 2007 paper, Rohrer & Taylor had students learning to use and apply different maths formulae to solve problems. Some students did blocked practice, some did interleaved questions. When measured at the end of the lesson(s), students who did blocked practice outperformed the interleaved practice students with an accuracy of 89% vs 60%. However, and very importantly, when tested after a one-week delay the percentages were 20% vs 63% respectively! The interleaved practice students outperformed the blocked practice students by roughly 3:1 on a delayed assessment! The lessons that would have been judged to show the most rapid progress resulted in significantly less sustained learning. The lessons in which performance was lower resulted in roughly triple the learning gains. In numerous other studies, Rohrer and others have replicated these findings and have shown why it is imperative we separate learning from performance and not use one to infer the other. If we are talking about the difference between good and outstanding lessons, we must take a long-term perspective, evaluating and discussing strategies and pedagogy that enhance learning rather than performance.


Desirable difficulties are desirable because responding to them successfully engages processes that support storage strength gains. They limit gains in retrieval strength and prevent learning becoming automatic too early, which would limit potential further increases in storage strength. However, they become undesirable difficulties if the learner is not equipped to respond to them successfully. For example, a student with low working memory will struggle if questions early in learning a new topic are interleaved and they are trying to select between numerous potential strategies at the same time. In this case, if they are placed into cognitive overload by interleaving, and they don’t have the required self-control and resilience, it would have a negative effect. Optimal storage strength gains require sub-optimal performance during lessons, and students (and teachers) need to be comfortable with this and remain motivated to get the benefit. This is a considerable challenge. I’ve always found showing students their progress to be a good way to boost their intrinsic motivation. Making lessons more challenging through the introduction of desirable difficulties, in the knowledge that students’ performance will be lower if we are maximising learning gains, is a hard sell to students who are motivated by seeing rapid performance gains.

Within the community of teachers currently reading educational research, learning styles have become a bit of an in-joke. They are repeatedly cited as an example of a seemingly intuitive idea that isn’t supported by research evidence. Bjork explains they are based on the meshing assumption: if learning is aligned to a particular personal format preference, it is easier to acquire and you will consequently accumulate more of it. The meshing assumption, that easier learning results in more learning, is false for reasons already discussed: limiting retrieval strength gains with desirable difficulties maximises storage strength (learning) gains. Learning styles are the opposite of a desirable difficulty. They are a quick win for performance, and consequently a loss for learning.

If Bjork’s work is accepted as contextually relevant and applicable to secondary maths education, I believe the following implications are logical deductions:

  • Maximal learning (storage strength gains) requires the limitation of retrieval strength gains, particularly early on in instruction, through the use of desirable difficulties such as: interleaving, spacing and high-frequency/ low-stakes testing. Nearly all maths textbooks have massed, blocked question sets. This needs to change. As Bjork points out, the same questions could be used, it is just the ordering that needs to change. Students need to be experiencing mixed-topic question sets regularly, not just during assessments. This gives them the necessary experience in selecting strategies in addition to executing strategies.
  • Rapid and sustained progress is an oxymoron- the two are inversely proportional. We should concern ourselves with understanding how to generate sustained progress and this should be the focus of pedagogical practice, discussions, interventions and performance management systems.
  • We should dissociate learning and performance. This means not using performance measures to infer learning gains. Learning cannot be measured within the time-frame of a lesson. Work sampling of students’ books does not tell you what the students have learned, only their in-lesson performance that day. Using AFL to assess concepts currently being taught only measures performance, not learning, so don’t use success on an exit ticket as proof of long-term retention and transfer (learning) of the material from that lesson. Using AFL to guide instruction within lessons is the right thing to do, but don’t use it to infer learning that has occurred within the current lesson. Assessment of learning rather than performance should feature a time delay from when the material was last covered and/or be contextually different, thus including a measure of transferability. In an assessing-without-levels world we must ensure we assess in a time-delayed, transfer-focussed way if we are to avoid repeating previous mistakes, where formative assessment learning judgements were based on performance rather than learning (APP etc).

Finally, it should be reiterated that there is a natural tension between the motivation boost of students seeing high performance gains and the reality that slower ones lead to better learning. As Bjork puts it:

“If someone gave me a new course and said ‘do everything you know how to do to make students’ long-term memorisation of key concepts the best’, I could give that a big try; or if they said, ‘do everything you know how to do to get the highest course ratings’, I know something about that too; but what’s awful is that they would not be the same course. They would be quite different courses”.

There is clearly a need for students to understand the ideas behind desirable difficulties, to some degree, if we need them to be comfortable with lower performance in harder lessons today, the benefits of which won’t be seen for many months or years. Many cultural and systemic expectations about what effective lessons look like may need reconsidering too. Teaching that facilitates outstanding student learning is different to teaching that facilitates outstanding student performance.

For more info on Robert Bjork’s work see: Go Cognitive

For Doug Rohrer’s publications see here.

CGP Maths Buster- a superb new learning and revision resource for GCSE maths

CGP are well-known for their excellent GCSE revision guides. Now they’ve taken their offering to a whole new level with GCSE Maths Buster. The £6 DVD ROM for PC & Mac is a comprehensive, interactive revision tool for GCSE maths students featuring:

Levelled practice– work your way through the entire maths curriculum ‘levelling up’ to unlock new content to study.

Timed tests– Take an on-screen timed test to identify your strengths and weaknesses. The software then suggests a revision plan tailored to your needs.

Practice questions– 55,000 exam-style practice questions. For each one view a worked example or see the notes from the CGP revision guide relevant to the topic.

Video tutorials– Watch video lessons for each topic.

Keep track of your progress– The software records your progress in each topic and provides an overview at any time.

Challenges– As you work through Maths Buster you unlock challenges such as ‘sudden fail’, ‘time trial’ and ‘against the clock’.

Practice papers– Print off practice papers. Complete answers and step-by-step video solutions are provided.

I’ve had a play with the software myself and am impressed with it. There are lots of revision offerings out there these days, but what I particularly like about Maths Buster is that it is comprehensive- it provides for all stages in the learning journey from video lessons, through practice questions and then onto exam preparation. The ‘smart’ side to the software that records individual students’ progress and then tailors its recommended lessons to their individual needs is very useful too.

For more info including a video introduction to the product and an online interactive demo visit