I’ve been using the Pedometer++ app since 2015 to track my daily steps, and overall I’m very pleased with how it works and looks. A little while ago, though, I noticed an issue: the number of floors I’ve supposedly climbed is wildly off in the app.

Daily and lifetime floor achievements in Pedometer++

A daily best of 12,403 would be the equivalent of climbing about five Mount Everests in a day. Exporting the data from the app and graphing it shows the problem in more detail:

Chart of steps per month

Something obviously went wrong with the floors data in 2015, and a quick email to Pedometer++ customer support confirmed this:

Several years ago changes were made to the step database when new features were introduced in the app.

This created the problem where your data prior to November 27 2015 was corrupted.

Sorry but we cannot fix that problem now.

While they claimed that the data was unfixable, I decided not to take their word for it. The Pedometer++ app allows you to export the data from the app as a CSV or as a custom Export.steps file that can be re-imported back into the app.

Running the file command on this backup file reveals that it’s LZFSE compressed:

» file Export.steps
Export.steps: lzfse compressed, compressed tables

LZFSE is an Apple-designed compression algorithm, and files can be compressed or decompressed with the compression_tool command on macOS:

compression_tool -decode -i Export.steps

Doing so reveals that Export.steps is just JSON-encoded data:

{
  "StepSample" : [
    {
      "intervalStart" : 634993200,
      "stepCount" : 0,
      "floorsAscended" : 0,
      "distanceMeters" : 0
    },
    {
      "intervalStart" : 634989600,
      "stepCount" : 24,
      "floorsAscended" : 0,
      "distanceMeters" : 20
    }
    // ...
  ],
  "PushSample" : [/* ... */],
  "StepCount" : [
    {
      "intervalStart" : 468194400,
      "stepCount" : 3827,
      "floorsAscended" : 3458, // ← distanceMeters has overwritten the floors field
      "distanceMeters" : 3458
    },
    {
      "intervalStart" : 468108000,
      "stepCount" : 7673,
      "floorsAscended" : 6927, // ← distanceMeters
      "distanceMeters" : 6927
    },
    // ...
  ],
  "GoalPoint" : [/* ... */]
}
  • StepSample contains step, distance, and floors data in 15 minute intervals. This is the new data format used since November 2015
  • PushSample contains the equivalent data for wheelchair users
  • StepCount contains step, distance, and floors data in the older format, in 1 day intervals. This is where the corrupted floors values live
  • GoalPoint stores your daily step goals over time

For all of these, the intervalStart field is the start of the interval expressed as an Apple Cocoa Core Data timestamp.
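Core Data timestamps count seconds from 2001-01-01 00:00:00 UTC rather than from the Unix epoch, so converting one means adding the fixed offset between the two epochs (978,307,200 seconds). Here is a minimal sketch of that conversion, using a hypothetical helper name and one of the intervalStart values from the sample above:

```ruby
require "time"

# Seconds between the Unix epoch (1970-01-01 UTC) and the
# Core Data reference date (2001-01-01 UTC).
CORE_DATA_EPOCH_OFFSET = 978_307_200

# Hypothetical helper (not part of any library): converts a Core Data
# timestamp to a Time. The result is forced to UTC only so the printed
# value does not depend on the machine's time zone.
def core_data_to_time(seconds)
  Time.at(seconds + CORE_DATA_EPOCH_OFFSET).utc
end

puts core_data_to_time(468_194_400).iso8601
# prints "2015-11-02T22:00:00Z"
```

That instant is midnight on 3 November 2015 in a UTC+2 time zone, which fits a value that marks the start of a day. When assigning timestamps to calendar days, convert to local time rather than UTC.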

Luckily for me, I only need to touch the older data stored under the StepCount field, which greatly simplifies things.

My first idea was to set the number of floors to zero for any day with more than, say, 100 floors ascended. However, looking at the data in the Health app on iOS revealed that the flights climbed numbers there looked perfectly reasonable for 2015.
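For reference, the discarded first idea amounts to only a few lines. This is a rough sketch rather than code that was actually used: the helper name is hypothetical, the threshold of 100 comes from the paragraph above, and the second sample entry is made up for illustration.

```ruby
# Threshold from the text: more than 100 floors in a day is treated as corrupt.
MAX_PLAUSIBLE_FLOORS = 100

# Hypothetical helper: returns new entries with implausible floor counts
# replaced by zero. The input entries are left untouched.
def zero_implausible_floors(entries, threshold: MAX_PLAUSIBLE_FLOORS)
  entries.map do |entry|
    next entry if entry["floorsAscended"] <= threshold

    entry.merge("floorsAscended" => 0)
  end
end

entries = [
  # Corrupted entry from the export above (floors equals distance).
  { "intervalStart" => 468_194_400, "stepCount" => 3827,
    "floorsAscended" => 3458, "distanceMeters" => 3458 },
  # Made-up entry with a plausible floors value.
  { "intervalStart" => 468_108_000, "stepCount" => 7673,
    "floorsAscended" => 12, "distanceMeters" => 6927 }
]

p zero_implausible_floors(entries).map { |e| e["floorsAscended"] }
# prints [0, 12]
```

The drawback is that every corrupted day ends up with zero floors, which is why recovering the real values from the Health app is the better option.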

The Health app lets you export all your health data as a zipped XML file. The file size for the XML file was almost 2GB once unzipped, so I chose to use Nokogiri’s SAX parser to process it.

Beware, the code below isn’t particularly pretty but it does the job:

#!/usr/bin/env ruby
# frozen_string_literal: true

require "bundler/inline"
require "time"
require "set"
require "open3"
require "json"

gemfile do
  source "https://rubygems.org"
  gem "nokogiri"
end

# <Record
#   type="HKQuantityTypeIdentifierFlightsClimbed"
#   sourceName="Mercury"
#   unit="count"
#   creationDate="2014-11-28 18:03:26 +0200"
#   startDate="2014-11-28 17:54:52 +0200"
#   endDate="2014-11-28 17:54:54 +0200"
#   value="2" />

class FloorsParser < Nokogiri::XML::SAX::Document
  HKQuantityTypeIdentifierFlightsClimbed = "HKQuantityTypeIdentifierFlightsClimbed"

  OUTPUT_COUNT = 10_000

  attr_reader :start_date_filter, :end_date_filter, :start_time_filter,
    :end_time_filter, :allowed_sources, :found_sources, :data

  def initialize(start_date:, end_date:, allowed_sources: [])
    @start_date_filter = start_date
    @end_date_filter = end_date
    @start_time_filter = start_date.to_time.freeze
    @end_time_filter = Time.new(
      end_date.year, end_date.month, end_date.day,
      23, 59, 59
    ).freeze
    @allowed_sources = allowed_sources
  end

  def start_document
    puts "Starting document processing... (each dot represents #{OUTPUT_COUNT} records)"
    @found_sources = Set.new
    @data ||= {}
    @processed = 0
  end

  def start_element(name, attrs = [])
    case name
    when "Record"
      @record = attrs.to_h

      @processed += 1
      print "." if (@processed % OUTPUT_COUNT) == 0
    end
  end

  def end_element(name)
    case name
    when "Record"
      if @record && @record.fetch("type") == HKQuantityTypeIdentifierFlightsClimbed
        @found_sources.add(@record["sourceName"])

        if allowed_sources.include?(@record["sourceName"])
          start_time = Time.parse(@record["startDate"])
          end_time = Time.parse(@record["endDate"])

          return unless start_time >= start_time_filter && end_time <= end_time_filter

          date = start_time.to_date

          if date != end_time.to_date
            puts "\nDay crossing record! #{date.iso8601}"
            puts "    (#{(end_time - start_time).to_i}s) [#{@record["sourceName"]}]\n"
          end

          @data[date] ||= {}
          @data[date][@record["sourceName"]] ||= 0
          @data[date][@record["sourceName"]] += @record["value"].to_i
        end
      end
      @record = nil
    end
  end

  def end_document
    puts "\n---\n"
    puts "Date range: #{start_date_filter.iso8601} to #{end_date_filter.iso8601}"
    puts "Found sources: #{@found_sources.to_a.join(", ")}"
    puts "Allowed sources: #{allowed_sources.join(", ")}"
    puts "Data count: #{@data.keys.length}"
  end
end

# Create our parser
floors_parser = FloorsParser.new(
  start_date: Date.new(2014, 1, 1),
  end_date: Date.new(2015, 12, 31),
  allowed_sources: ["Mercury"]
)
parser = Nokogiri::XML::SAX::Parser.new(floors_parser)

# Send some XML to the parser
parser.parse(File.open("./export.xml"))

data = floors_parser.data

puts "Decompressing steps data"
decoded, stderr, status = Open3.capture3("compression_tool -decode -i ./Export.steps")

if status.success?
  steps = JSON.parse(decoded)

  puts "Fixing floors ascended data"
  steps["StepCount"].each do |item|
    # Parse the Core Data timestamp
    time = Time.strptime((item["intervalStart"] + 978307200).to_s, "%s")
    # item["intervalStart"] = time # for debugging

    if (flights = data.dig(time.to_date, "Mercury"))
      item["floorsAscended"] = flights
    else
      item["floorsAscended"] = nil
    end
  end

  puts "Replacing missing floors ascended data with the daily average"
  values = steps["StepCount"].map{|i| i["floorsAscended"]}.compact
  average = (values.sum.to_f / values.length).round
  steps["StepCount"].each do |item|
    item["floorsAscended"] = average if item["floorsAscended"].nil?
  end

  puts "Saving the fixed data"
  filename = "Export-fixed-#{Time.now.to_i}.steps"

  Open3.popen3("compression_tool -encode -o ./#{filename}") { |stdin, stdout, stderr, wait_thr|
    stdin.write JSON.pretty_generate(steps, space_before: " ")
    stdin.close

    exit_status = wait_thr.value

    if exit_status.success?
      puts "Success! Generated #{filename}"
    else
      # Process::Status#to_i returns the raw status bits, not the exit code
      exit(exit_status.exitstatus || 1)
    end
  }
else
  puts "Decoding steps failed"
  puts stderr
  # Process::Status#to_i returns the raw status bits, not the exit code
  exit(status.exitstatus || 1)
end

The process goes roughly like this:

  1. Parse out the data from the XML file exported from Apple Health
    • In my case I only cared about data from 2015 and from just one source: my iPhone from the time (named ‘Mercury’)
    • The export has the flights climbed data in intervals shorter than a day, so combine those into a single value per day
    • Plus some code to guard against edge cases, like intervals crossing day boundaries
  2. Decompress and parse the steps data exported from Pedometer++
  3. Iterate over the StepCount data and replace the floorsAscended number with the value from the parsed Apple Health data
    • If there wasn’t any data for the given day, use the average from the days with data
    • There weren’t all that many days with no data, so I could really have just set the floors value for those days to zero…
  4. Generate JSON for the fixed data and compress it with the compression_tool command
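
Stripped of the XML parsing and compression, the core of steps 1 and 3 is a group-by-day sum followed by an average fill. The sketch below isolates that logic. The helper names and sample records are made up for illustration, and unlike the script above it falls back to zero when no day has any data.

```ruby
require "date"

# Hypothetical helper: sums sub-day flight records into one total per day.
# Each record is a [Date, Integer] pair.
def floors_per_day(records)
  records.each_with_object(Hash.new(0)) do |(date, value), totals|
    totals[date] += value
  end
end

# Hypothetical helper: looks up each day's total, falling back to the
# rounded average of the days that do have data (or 0 if none do).
def fill_floors(days, totals)
  known = days.select { |day| totals.key?(day) }.map { |day| totals[day] }
  average = known.empty? ? 0 : (known.sum.to_f / known.length).round
  days.to_h { |day| [day, totals.fetch(day, average)] }
end

# Made-up sample records: two on 1 March, none on 2 March, one on 3 March.
records = [
  [Date.new(2015, 3, 1), 2],
  [Date.new(2015, 3, 1), 3],
  [Date.new(2015, 3, 3), 9]
]

totals = floors_per_day(records)
days = [Date.new(2015, 3, 1), Date.new(2015, 3, 2), Date.new(2015, 3, 3)]

fill_floors(days, totals).each do |day, floors|
  puts "#{day.iso8601}: #{floors}"
end
# prints:
# 2015-03-01: 5
# 2015-03-02: 7
# 2015-03-03: 9
```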

Parsing the XML data is the longest step and it takes about 80 seconds to run.

To import the fixed data, I deleted and reinstalled the Pedometer++ app and then opened the fixed .steps file from an email that I sent to myself. I first tried uploading the fixed file to iCloud Drive and opening it from the Files app on my phone, but for some reason that didn’t work (Pedometer++ launched but nothing was imported).

Finally, checking the achievements in Pedometer++ reveals that the fix worked:

Daily and lifetime floor achievements in Pedometer++ after the fixed data was imported
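
A similar sanity check can be run on the fixed JSON before importing it. The sketch below uses a hypothetical helper and made-up entries. It only covers the daily StepCount entries, so its totals will not match the in-app achievements, which presumably also include the newer StepSample data.

```ruby
# Hypothetical helper: daily best and lifetime total floors for entries
# that each cover one day (such as the StepCount array).
def floor_summary(step_counts)
  floors = step_counts.map { |entry| entry["floorsAscended"].to_i }
  { daily_best: floors.max || 0, lifetime: floors.sum }
end

# Made-up entries standing in for the fixed StepCount data.
fixed = [
  { "intervalStart" => 468_194_400, "floorsAscended" => 5 },
  { "intervalStart" => 468_108_000, "floorsAscended" => 9 }
]

summary = floor_summary(fixed)
puts "Daily best: #{summary[:daily_best]}, lifetime: #{summary[:lifetime]}"
# prints "Daily best: 9, lifetime: 14"
```

A daily best in the thousands at this point would mean the corrupted values are still present.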