Journaled Flash Storage – Emulating EEPROM over Flash, ACID Transactions, and More. Part I – Flash vs EEPROM

 
Author:  Follow: TwitterFacebook
Job Title:Sarcastic Architect
Hobbies:Thinking Aloud, Arguing with Managers, Annoying HRs,
Calling a Spade a Spade, Keeping Tongue in Cheek
 
 

In recent years there is a strong trend related to new MCUs, with more and more of them not providing traditional EEPROM, but proposing to use Flash as a means of persistent storage instead. While the idea of using circular buffer for emulating EEPROM over Flash is well-known and widely used, implementation specifics are not that universally accepted and differ greatly.

EEPROM vs Flash

In this article we will discuss the issues related to using Flash as a persistent storage/non-volatile memory, existing implementations which address emulating EEPROM over flash, and will propose our own implementation which has quite a few desirable properties, while keeping implementation complexity at reasonable level.

Due to the size, the article has been divided into 3 parts. The first part compares Flash and EEPROM from developer’s point of view, and defines APIs and properties which are desirable for application layer.

Upcoming parts of the article:
Part II. Existing Implementations by Atmel, SiLabs, TI, STM, and Microchip
Part III. JoFS Itself

Flash vs EEPROM – Similarities and Differences

First of all, we need to note that there are lots of similarities between EEPROM and Flash. Both are essentially non-volatile memories, and both can be programmed electrically (unlike, say, EPROM which requires ultra-violet light to be erased). From this 50’000-feet height, EEPROM and Flash indeed aimed towards the same task – saving our data persistently (that is, until we want to remove/replace the data).

However, as usual, the devil is in the details, and there are substantial differences between EEPROM and Flash, which affect implementations greatly. We won’t go into physical-level reasons for these differences (for some physics, please refer to [MakwanaSchroder]), but instead will just summarize them briefly:

  • Hare pointing out:You cannot erase single byte of Flash, but need to erase the whole pageFrom developer’s perspective, Flash has different APIs than EEPROM. While EEPROM has usual-for-memory API which allows to rewrite any byte (or at least a few bytes such as 2 or 4 bytes), Flash is very different in this regard. Very briefly, you cannot erase single byte of Flash, but need to erase the whole page; for further details, see “Working with Flash” section below.
  • Flash usually has much fewer rewrite cycles than EEPROM (ballpark numbers are 10’000 cycles for Flash, and 100’000 cycles for EEPROM)
  • On the positive side, Flash tends to have faster reading speeds than EEPROM, and to be larger than EEPROM (the latter – at the cost of program memory).

Working with Flash

As noted above, working with Flash is quite unusual compared to working with EEPROM.

First of all, Flash memory is always separated into pages. Size of these pages varies, but usually it is around a few Kbytes in size (2K being quite a common number). In some cases, however, instead of same-size pages, “sectors” are used, with size of sectors being different (usually much larger than that of pages); more importantly for our purposes, there are usually fewer sectors than pages, what has unpleasant implications for all the EEPROM-over-Flash emulation algorithms. In short – if you need to use Flash as a persistent storage, and have a choice between MCU-with-many-Flash-pages and an MCU-with-few-Flash-sectors, go for the one with many pages, you’ll get much less overhead when implementing persistent-storage-over-Flash (see further details below). Notable examples of MCUs with sector-based Flash include STM32F2 and STM32F4 series (though STM32F1 and STM32F3 are page-based), as well as Atmel SAM4 series.

Now, let’s discuss the life cycle of a typical Flash memory.

Originally (after being manufactured, or after a page erase), all bits of Flash have the same value; by tradition, it is usually interpreted as all bits being 1 (and therefore, all bytes being 0xFF).

Hare thumb up:While at this stage it is impossible to bring any of the Flash bits from 0 back to 1, it is perfectly possible to bring any bit from 1 to 0.Then, it is normally possible to bring any of these bits to 0 (to “write” 0 bit to Flash); while provided Flash-writing APIs are usually byte-oriented (or even word-oriented, with the word consisting of 2-4 bytes), it doesn’t change the principle behind – while at this stage it is impossible to bring any of the Flash bits from 0 back to 1, it is perfectly possible to bring any bit from 1 to 0. This process can be repeated as many times as needed. It should be noted that it is not a write which counts against cycle limit in Flash, but erasure (see below).

To bring any bit from 0 back to 1, it is necessary to “erase” it. The Big Caveat here is that with Flash, erasing cannot be implemented at a bit level or even at a byte level; Flash erasing can be made only at a page level. That’s exactly why the page size (and pages-vs-sectors issue described above) is so important: to have any kind of really persistent storage, you’ll need to have at least 2 pages/sectors for your Flash, and if these 2 sectors take 256K out of your 512K-Flash (while you need only to store 100 bytes, which is quite typical for an MCU app) – it is quite a waste.

What Exactly We Want to Achieve? APIs

Now, as we know what we’re dealing with, let’s think a bit about what exactly are the properties we’d like to have. Of course, it is all about persistent storage, but what are the APIs we want to have and the guarantees we want to achieve?

In general, the application level wants to store objects. So, let’s define a hypothetical API (provided to upper layers) consisting of the following functions:

  • void writeObject(int objectID,byte[] data)
  • byte[] readObject(int objectID)

Note that definitions are provided in a pseudo-language which assumes that error conditions are reported as exceptions, and that byte[] also contains array size; in practice, an efficient MCU-oriented APIs will be different (with lots of pointers and/or references floating around), but it is beyond the scope of this article.

To interact with Flash, let’s define the following API:

  • byte readByte(addr_t address)
  • void writeByte(addr_t address, byte value)
  • void erasePage(page_addr_t page_address)

So far so obvious, but once again – the devil is in details, so we need to discuss assumptions and guarantees which apply to the APIs above. They are extremely important, as it is understanding of guarantees which often separate a program-which-works-99%-of-the-time (equivalent to “fails every time the proverbial worst possible moment comes”) and a program-which-works-100%-of-the-time.

What Exactly We Want to Achieve? Assumptions and Requirements

For the moment let’s assume that our objects-to-be-stored are relatively small compared to the size of allocated Flash pages (strictly speaking, compared to SZ=PAGE_SIZE*(N_PAGES-1)). How small maximum object size should be compared to SZ, varies, but for practical purposes, we expect SIZE(OBJECT) to be at least 10x smaller than SZ. Dealing with large objects is quite difficult and will lead to significant complications, and also to significant inefficiencies; we’ll discuss ways to add support for large objects (via file systems and/or databases built on top of JoFS) in Part III of the article. In the extreme case our objects can be as small as a single byte, allowing us to emulate EEPROM API right on top of our writeObject()/readObject() API described above.

This leads us to the following

Assumptions:

  • Dedicated Pages. We have at least 2 Flash pages dedicated to our purposes
  • Small Objects. Objects to be stored are small compared to SZ=PAGE_SIZE*(N_PAGES-1). In addition, objects are small enough to fit one page (accounting for page headers, checksums etc.)

Assertive hare:First of all, we want to be sure that whatever-is-already-written to our persistent storage, stays there; this property is known as Durability.Now let’s think what kind of guarantees we want to have from our storage.

First of all, we need to be sure that after we wrote an object, we can only read something which we wrote there. Let’s name this property Validity. Essentially, it is a prohibition on reading spurious data, which has nothing to do with whatever-we-wrote to our object. While with most of persistent storage Validity is taken as granted, we’ll see that for over-Flash simulations ensuring Validity is quite a challenge.

Now, as we know that the data is valid, we want to be sure that whatever-is-already-written to our persistent storage, stays there; this property is known as Durability. More precisely: as soon as our API writeObject() call returns (which means that writing the object is successful, otherwise an exception would be thrown) application can rely on the data being stored persistently, and available for the future access via readObject(). However, at any point before writeObject() API returns (for example, if writeObject() never returns due to power loss), Durability is not guaranteed (though object MAY still become persistent even if writeObject() didn’t return).

The next property we need is Atomicity. That is, even if our writeObject() call fails for whatever reason, we want to be sure that on the next restart, when we read our object, we’ll get either old version of the object, or the new one, but not half-of-the-old-object plus half-of-the-new-object. This requirement becomes very obvious in the following example. Let’s consider our object being a 2-byte integer, which has a value of 255 at the moment, and we are trying to write 256 over it (effectively incrementing it); without Atomicity guarantee we can end up in a situation when lower-byte is already rewritten, and higher-byte is not, leading to the value of 0 🙁 ; with Atomicity guarantee, however, we know for sure that even in the worst possible case we’ll get either 255, or 256 (which simplifies life of an application writer a lot).

These two properties (Atomicity and Durability) are well-known in database world; to complete the picture and make our storage an ACID-compliant database, we would need to add other two properties, which are Consistency and Isolation. Depending on exact definitions and APIs, it is possible to achieve these properties too (in particular, it is possible to achieve them with our Journaled Flash Storage described in Part III of the article), but for now we’ll leave them out of our requirements.

Arguing hare:In addition to these desirable properties, we need to define conditions (commonly referred as failure modes) when these properties need to be guaranteedIn addition to these desirable properties, we need to define conditions (commonly referred as failure modes) when these properties need to be guaranteed. For example, if Flash is fatally broken (so that the whole Flash is inaccessible), it is obvious that there is no algorithm which can provide any Durability; on the other hand, in case of a sudden power loss it is possible to guarantee both Durability and Atomicity. This means that we need to define what kind of failure modes we want to withstand.

From the practical perspective and for the purposes of this article, at the very least, we want to withstand sudden power loss at any point during our system operation (such things do happen a lot in an MCU world).1

Also quite obviously, we don’t want to rely that if power loss occurs during any modifying operation on the lower level (including page erase), then any of items-affected-by-operation-interrupted-by-power-loss, will be in a well-defined state. In other words, if the power loss happens during writeByte() call, we need to assume that after the power is restored, the-byte-we-were-writing-to can be in a state which is different from both original state and final state; if the power loss happens during erasing page – then the whole page can be in a state which is different from both original state and final state.

This requirement can come in one of two versions. In a “Stronger” version of the requirement, we will assume that an items-affected-by-operation-interrupted-by-power-loss, can be in any state. In a “Weaker” version of the requirement, we will rely on the physical properties of Flash, and will assume that only those bits which are different between original state and final state, can change. For example, with “Weaker” version of the requirement, if we’re writing 0xFA (which has only 2 zero bits) over 0xFF (which has all bits=1), then interrupting this operation can possibly result in only two intermediate states (in addition to 0xFA and 0xFF): 0xFE and 0xFB. In practice, we expect that “Weaker” version should be sufficient if underlying Flash consists of Single-Level-Cells (SLC), but for Multi-Level-Cells (MLC) relying on “Weaker” version of the Requirement will lead to problems. As of now, I am not aware of MLC Flashes being used as an MCU built-in storage, but (a) it may change in the future, and (b) some MCUs already support external MLC Flashes, so this limitation of “Weaker” Requirement needs to be taken into account.

Now, we’re ready to specify our

Requirements:

  • Validity. After an object is written, only the data which has been written to the object, can be read back. Validity doesn’t include Atomicity (see below), so for Valid but non-Atomic implementation it is possible to read back partial objects (part of the object being old, and part of the object being new).
  • Durability. All calls to writeObject() which have returned (and not thrown an exception), guarantee that whatever-was-written, stays there regardless of future power losses etc. etc.
  • Atomicity. Even if any of modifying Flash operations (such as writeByte()/erasePage()) is interrupted (by power loss etc.), on the next read it is either whole-old-object, or a whole-new-object which is returned; no “half-of-old-object and half-of-new-object” situations are allowed.
  • Resilience to Power Loss. Properties above MUST be guaranteed for a power loss happening at any point during algorithm operation.
  • Resilience to Arbitrary Data Resulting from Write Operation Interrupted by Power Loss
    • “Stronger” version. Properties above MUST be guaranteed even if: the byte addressed by interrupted-by-power-loss writeByte() Flash operation, can be in any state after the power is restored, and the whole page addressed by interrupted-by-power-loss erasePage() Flash operation, can be in any state after the power is restored.
    • or “Weaker” version. Properties above MUST be guaranteed even if: all the bits which are about to be changed by writeByte() Flash operation, in case of the operation being interrupted by power loss, can be in any state after the power is restored, and all the bits which are about to be changed by erasePage() Flash operation, in case of the operation being interrupted by power loss, can be in any state after the power is restored. If underlying Flash is MLC one, this “Weaker” version of Requirement is not sufficient.

1 If we’re not interested in withstanding power loss, why’d we need any “persistent” storage at all?

 

To be Continued…

Tired hare:In this part of the article, we’ve only managed to get to the point when we’ve defined how Flash is different from EEPROM, and defined what exactly are the properties application developers can expect from the persistent-storage implementation. In the next part of the article, we’ll analyze existing implementations of EEPROM-over-Flash emulations, and discuss their drawbacks.

Stay tuned!

EDIT: the following parts of the article have been published:
Part II. Existing Implementations by Atmel, SiLabs, TI, STM, and Microchip
Part III. JoFS Itself

Don't like this post? Comment↯ below. You do?! Please share: ...on LinkedIn...on Reddit...on Twitter...on Facebook

[+]References

Acknowledgement

Cartoons by Sergey GordeevIRL from Gordeev Animation Graphics, Prague.

Join our mailing list:

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.