LINISNIL

some complaints about github actions

May 14, 2025 alan

i’ve been learning a lot about the github actions ci/cd platform lately – it’s a pretty nifty system and superior in dev experience compared to systems like jenkins in many ways. on the other hand, i’ve also encountered a handful of issues that i wish github / microsoft would address.

here are some of them…

lack of built-in static checker

i’m not even talking about a full-blown type system here – even something basic that automatically alerts you when the workflow file you checked in contains obvious indentation or syntax issues would help. that said, i have noticed that they flag cases where i’m passing data to undefined input params, so it’s possible this is a work in progress. i recently added https://github.com/rhysd/actionlint to our internal github actions repo because i wanted more immediate and clear feedback on why my workflow files are invalid. the standard error messages are pretty vague. i think this should just be built-in functionality rather than something that requires calling out to a third party linter

no way to manually trigger a workflow run from branch

while you can manually RE-trigger an already run workflow from the history, there is no general way to manually trigger a run from scratch for an arbitrary event. (workflow_dispatch lets you start a run by hand, but only for workflows that already declare that trigger, and it doesn’t let you stand in for a push or pull request event.) this is probably complicated by the fact that workflows can be triggered by many different types of events (push, pull request, a call from another workflow) and many of the steps in a workflow are coupled to the information that is available from the event context. for example, a job might only run on a label event where the label has a particular value. or a context value is passed as an argument to another job but that value is only present in some contexts and not others.

there’s typically a large number of jobs that don’t depend on much event state, so i think github should let you supply the trigger type and even the payload. this would certainly make qa’ing actions easier. i think projects like https://github.com/nektos/act that attempt to simulate an actions environment using specified triggers are a step in the right direction, but it would be even better if github provided this natively

referring to a reusable component as an “action”

when i think of github actions, i’m thinking about the platform as a whole. but github decided to name a specific type of reusable component an action as well. what about reusable workflows? nope those are not actions. additionally, if you’re referring to reusable actions, there are different subtypes: composite actions (basically just reusable steps), container actions, and javascript actions. this makes googling / searching online for docs painful if you don’t already know the various contexts of the word “action”. this is like if jenkins decided to call their concept of shared libraries reusable jenkins..

global “github” context variable

during runtime, steps in a workflow can access information about the context in which they’re running via context variables. for example, a workflow is made up of jobs and a job is made up of steps. so the steps context variable refers to information about the steps in the current job and the job context variable refers to information about the currently running job.

okay but what about the workflow? well, turns out workflow metadata like name or file path are on the global github context variable, which feels like a dumping ground for anything the github team didn’t feel like breaking out into another separate context variable. there’s repository data, github token data, workflow data, triggering events data, etc.

on the one hand, i get that this serves as a useful catchall with a name that is unlikely to collide with any newly introduced components to the actions domain. but wow i’m always forgetting what’s actually defined in it

you cannot reference your own action from a re-usable workflow

let’s say you wrote an action to be reusable by other repos / projects. now you have some reusable workflows for others to use as well, but of course being the DRY minded developer you are you figured hey, why don’t i re-use my own reusable actions!

well, you can – but it’s not as easy as referencing them by some relative path in the repo. why? because when your reusable workflow is called, the working context is the context of the caller repository, where your action files do not live. github essentially treats the external workflow call as if that workflow was written in the calling project itself. the only way to reference the action is to treat it like a third party action and reference it by its fully qualified repo name and tag, which github then fetches across the network

if you think that sounds like a release management nightmare, it is. each time you make a change to an action that’s referenced by a reusable workflow, you need to perform two releases: one release for the action and then another for the workflow referencing the action. you might think that a possible workaround might be to dynamically reference the same tag used to invoke the workflow. great idea, too bad you can’t do that.

as a third party actions creator, you’re not required to pin your dependencies

there was a security incident this year where a github actions repo was compromised and an existing tag on that repo was updated to point to a commit made by an attacker that dumps out secrets. the response to this at work was to stop referencing actions by tags, which are mutable, and reference them by sha instead. github even recommends this now as a security practice, but unfortunately they don’t require maintainers to pin their dependencies. so while you may be pinning the sha of an external action, that action may still be pointing to other actions by tag, so you’re only as safe as the weakest link.
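a rough sketch of what a pin check could look like, in python. this is not something github ships: the function name, the regexes and the example workflow text are all made up for illustration, and it only looks at the `uses:` lines in one file (it can't see what the referenced actions themselves depend on, which is exactly the weakest-link problem).

```python
import re

# Matches a `uses:` reference such as `actions/checkout@v4` (hypothetical helper regex).
USES_RE = re.compile(r"^\s*(?:-\s*)?uses:\s*['\"]?([^\s'\"#]+)", re.MULTILINE)
# A full commit sha is 40 lowercase hex characters.
FULL_SHA_RE = re.compile(r"^[0-9a-f]{40}$")


def unpinned_actions(workflow_text: str) -> list[str]:
    """Return every `uses:` reference that is not pinned to a full commit sha."""
    unpinned = []
    for ref in USES_RE.findall(workflow_text):
        # Local actions (./path) and docker:// images have no git ref to pin.
        if ref.startswith("./") or ref.startswith("docker://"):
            continue
        _, _, version = ref.partition("@")
        if not FULL_SHA_RE.match(version):
            unpinned.append(ref)
    return unpinned


# Made-up workflow snippet: one tag reference, one sha reference, one local action.
example = """
steps:
  - uses: actions/checkout@v4
  - uses: some-org/some-action@0123456789abcdef0123456789abcdef01234567
  - name: local
    uses: ./.github/actions/build
"""

print(unpinned_actions(example))
```

only the tag reference gets reported here; the sha-pinned and local references pass.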

why is it so hard to clear caches…

github offers a REST api to list and delete caches but i wish this were an option through the job view. for example, i’d like to re-run a job on my branch but from a clean cache. this appears to be available through the UI, but i haven’t been able to actually filter by branches. i don’t think this works the way they advertise it to work. either way, i really wish this were just a checkbox on the job run itself
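for reference, here's roughly what the REST route looks like as a python sketch using only the standard library. the helper names are mine, the owner / repo / token values are placeholders, and pagination is ignored, so treat it as a starting point rather than a finished tool. it relies on the list endpoint (`GET /repos/{owner}/{repo}/actions/caches` with a `ref` filter) and the delete-by-id endpoint.

```python
import json
import urllib.parse
import urllib.request

API = "https://api.github.com"


def _request(method: str, url: str, token: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated GitHub API request."""
    return urllib.request.Request(url, method=method, headers={
        "Accept": "application/vnd.github+json",
        "Authorization": f"Bearer {token}",
    })


def list_caches_request(owner: str, repo: str, branch: str, token: str) -> urllib.request.Request:
    # The `ref` query parameter filters caches down to a single branch.
    query = urllib.parse.urlencode({"ref": f"refs/heads/{branch}"})
    return _request("GET", f"{API}/repos/{owner}/{repo}/actions/caches?{query}", token)


def delete_cache_request(owner: str, repo: str, cache_id: int, token: str) -> urllib.request.Request:
    return _request("DELETE", f"{API}/repos/{owner}/{repo}/actions/caches/{cache_id}", token)


def clear_branch_caches(owner: str, repo: str, branch: str, token: str) -> int:
    """Delete the caches on the first page of results for a branch; returns how many."""
    with urllib.request.urlopen(list_caches_request(owner, repo, branch, token)) as resp:
        caches = json.load(resp)["actions_caches"]
    for cache in caches:
        urllib.request.urlopen(delete_cache_request(owner, repo, cache["id"], token)).close()
    return len(caches)


# Nothing is sent here; this only shows the URL that would be requested.
print(list_caches_request("my-org", "my-repo", "my-branch", "TOKEN").full_url)
```

having to script this is kind of the point of the complaint: it works, but it's a lot more ceremony than a checkbox on the re-run dialog.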

docker caches

April 3, 2025 alan

“cache” is an extremely overloaded term in docker land nowadays because docker supports different types of caching. i’ll cover the three main cache terms – what they are and when they’re useful

build cache

this is what most people think of when you mention a docker cache. the build cache caches built image layers, with the cache key being a docker instruction and/or the checksum of file contents. not all instructions create new image layers that contribute to the final size of the image – basically any of the commands that can write to the filesystem like COPY, ADD, RUN will create new layers. any FROM command that pulls in another image will also bring in that image’s layers.

you don’t need to do anything special for the builder to make use of the build cache – this just comes for free. you can try it yourself. if you build a dockerfile once and then build it again, you’ll see that it skips the steps where nothing changed. invalidation happens automatically as well. if the instruction changes, the cache is invalidated. if a file that is part of a COPY command changes, the cache is invalidated.

the build cache is probably most useful locally when you’re working on a dockerized application and using the same internal cache on your host machine.

external (build) cache

this is just another form of a build cache, except instead of relying on the builder’s internal cache you can use an external cache storage. once you start building images in a ci/cd environment that may be running your image builds on different instances, it may not be sufficient to rely on the default internal build cache to store image layers. one run of your build will cache its layers on its own machine, but a subsequent run on a different executor machine will rebuild because it has its own internal docker cache. this can be wildly inefficient

as a solution for this, docker offers external cache backends. these can be fully remote ones like s3, where you specify a bucket and region, or more local (but outside of buildkit’s internal cache) cache stores that store the cache in a specific directory on the host machine

many companies run dev workflows using github actions and use gh actions as their primary ci/cd pipeline, and docker allows you to point the cache to github’s cache.

cache mount

finally, we have cache mounts. these are a newer addition to build caches and they’re often confused with build caches for a couple of reasons

there’s cache in the name
this cache is available only during image build – not too different from build caches

so what’s the difference? you’re not using the cache to store image layers, you’re using it in a more narrow sense to store artifacts produced by a RUN command. cache mounts are intended to cache results of RUN instructions so that any subsequent executions of that command during build time will go faster.

it’s used by specifying --mount=type=cache on a RUN instruction along with the target location. at the time of writing, you cannot specify an external storage for this cache so this is entirely internal to the builder on the machine. anyway here’s an example

FROM node:latest
WORKDIR /app
COPY package.json ./
RUN --mount=type=cache,target=/root/.npm npm install

in the above, the first ever build will copy package.json and install the NPM dependencies, with npm’s download cache at /root/.npm living in docker’s internal cache. if package.json changes and both the COPY and RUN layers are invalidated, the NPM install will run a lot faster in the new build step because the dependencies are found in the cache.

as far as i know, at the moment supplying an external cache via --cache-from to the builder has no effect on the source of the cache mount. this makes it not very useful in situations like ci/cd where you’re dealing with ephemeral executors. you might be using an external build cache like s3, but cache mount storage will still be internal to the builder instance on the machine. yeah, confusing.

once i actually learned the differences between these, searching docs became far easier. if i need to understand details on the build cache which caches image layers, i’ll search for build cache, image cache, or external cache. if i’m writing a new dockerfile that needs to cache dependencies for a particular build instruction to optimize local build times, i’ll look up cache mounts if i need an assist on the api

post half marathon injury: posterior tibial tendinopathy

March 27, 2025 / April 2, 2025 alan

after my half marathon, i was feeling a minor ache on the medial side of my left ankle. i was still able to walk fine immediately following the run, but once my adrenaline came down i noticed i was walking with a slight limp. initially, i chalked this up to just having very tired legs, but a day later it was clear that it wasn’t the same type of soreness i was experiencing in other parts of my body. i had a hard time going up and down stairs and noticed myself avoiding any sort of flexion (plantar/dorsi) on my left foot. uh oh.

so i googled “medial ankle pain runner” and one of the first results that came up was posterior tibial tendinopathy. the tibialis posterior muscle is, as the name suggests, in the posterior of the tibia – the muscle lies deep in the calf and it is connected to various parts of the inner foot (navicular bone, cuneiforms, and some of the phalanges) through a long tendon called the posterior tibial tendon. even before trying the diagnostic exercises, i had a sense this was likely my injury because i was experiencing pain in my medial malleolus. turns out, the posterior tibial tendon actually passes right under the medial malleolus and it’s common to experience pain & inflammation of the tendon in that area as pain directly from the malleolus!

to diagnose, i have to understand the muscle function. the tibialis posterior muscle is responsible for

plantar flexion (the muscle in the back of tibia near calf contracts and pulls the heel of the foot up)
foot inversion (pulling foot in towards medial side) and adduction
arch control of foot. tibialis posterior dysfunction (not what i have) leads to a collapsed arch. arch control is a big part of foot stability – if you can’t control the height of your arches, you will have a tough time standing on one leg

two main diagnostic exercises are single leg stands and single leg calf raises

even on tired legs, i easily did a calf raise on my right leg but i could not perform a single calf raise on my left. earlier that day, i also took a yoga class and could not hold a stable tree pose on my left leg. so basically, i’ve failed both tests. i cannot stabilize and i cannot plantar flex under weight. i don’t think i ruptured a tendon, because there’s almost no inflammation and it’s not like i’m unable to walk. it’s likely just strained, but i want to give it a couple of weeks before getting a professional opinion. the recommended protocol for recovery is pretty standard: RICE (rest, ice, compress, elevate) until the pain is gone and then gradually begin to re-introduce strength training. the recovery period for tendon injuries is weeks to months – so it’s likely i’m going to have to cancel my upcoming april races. it’s day 4 since my half and i’m starting to feel much better (i can plantar flex now without weight with no pain). i can also initiate a calf raise but not without pain, so i’m going to avoid anything that aggravates it.

one of my big questions is – how the heck did this happen? i felt pretty well trained. i had done a 12 mile long run at the peak of my training block and have incorporated some hill training as well in the final weeks. one of the clues came from photos taken of me during the race – here’s an interesting one taken at the final 2 miles of the race. i annotated it with lines that highlight a form breakdown

there’s two problems i notice here…

the first is a pelvic drop where my right pelvis was dropping. this tilt towards the right tends to happen with weak or fatigued gluteus medius muscles. in the diagram below, the highlighted green muscle is the glute med. that muscle is primarily responsible for abduction and it plays a major role in keeping your hips stable when you walk. if you look at how it attaches to the greater trochanter, imagine if that muscle isn’t able to contract – what happens? the opposite side will collapse. that’s exactly what’s happening in my photo. my left glute med was fatigued and i was experiencing a drop on the right.

pelvic drops are not great in any running situation because they trigger an entire chain of movements that create injury. i had to deal with IT band syndrome two years ago as a result of having a pelvic drop. another interesting consequence of pelvic drop is over pronation, which i annotated in my running photo. i’m excessively pronating in that photo on my left leg. excessive pronation is sort of the smoking gun for why i have posterior tibial tendinopathy. i’m basically straining that tendon over and over again, because remember – that tendon is there to invert/pull the ankle inwards. pronation is the exact opposite of that!

take a look at this left foot. the very right photo is what my left foot was doing.

turns out, one of the ways your body compensates for pelvic drops is to overpronate. the way it works is this – the drop of the pelvis (in this case, my right pelvis) shifts the center of mass slightly to the right. this causes the knee on the left to move more inwards to try to stabilize the body, which causes the excessive pronation. even worse, ground contact time is increased because the body needs more time to recover from the pelvic tilt.

understanding the forces at play here really helps, because this gives me a lot of ideas on what i need to work on during strength training and what to pay attention to during long distance races like the half. i’ll likely want to strengthen all the muscles that help control excessive pronation, as well as target all the muscles like the glute med that contribute to pelvic stability. i might also need to work on increasing my easy long runs to improve my overall endurance so i don’t start fatiguing as early when i’m putting in max effort

anyway i’m thankful for the internet and i’m hopeful about recovery. will post updates here like i did with my knee pain from november of last year. weirdly .. i consider one of the blessings of having minor-ish injuries to be that they teach me so much about my body and anatomy. before this i had no clue about the muscles other than the gastrocnemius and soleus in my calf! it’s a lot of fun for me to learn about running mechanics and get better at troubleshooting issues that will inevitably come up in this very repetitive activity.

login.fr.cloud.gov 2019 oauth redirect_uri vulnerability report

February 21, 2025 alan

i’ve been reading oauth vulnerability reports to better understand the various attack vectors. a common one is a case where the redirect_uri is not validated as part of the initial authorization request. i found an old hackerone report about this exact scenario

here’s the report

ok so login.fr.cloud.gov is clearly the authorization server. the attacker is using a registered oauth client with the provider that is configured with a different set of redirect_uris. i’m going to guess the attacker doesn’t actually own the client app in this case. the client_id is not a secret, so it’s fairly easy to get a hold of. once the attacker in this case constructs the link with their own custom redirect param, they can share it

as part of a phishing attack, an unsuspecting user can click on that link and authorize access. upon success, they will be redirected to a valid redirect url with a code parameter that can be exchanged for an access token. with that access token, requests can now be made against the server for user data

the reporter here is suggesting that an attacker can provide any malicious url, so that the authz code actually gets redirected to evil.com where the attacker can retrieve the code. for them to really do anything with the code, we probably have to assume that this is an implicit grant flow where the authorization step actually redirects with an access token (actually this reporter does mention access token in the report, so that’s probably a safe assumption). otherwise, an attacker can’t really do much with the authz code without the oauth client’s full credentials.

the simplest way to mitigate this attack is to make sure redirect uris are validated properly. nowadays it’s also not recommended to use implicit grants, which do not require the oauth client to present an authz code along with protected credentials. if you have a confidential oauth app, you need to use the authorization code flow.
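to make "validated properly" concrete, here's a minimal python sketch of exact-match redirect uri validation on the authorization server side. the client id, the registered uri and the function name are all hypothetical; this isn't how login.fr.cloud.gov does it, just the general shape of the check.

```python
# Hypothetical registry: client_id -> set of redirect URIs registered for that client.
REGISTERED_REDIRECT_URIS = {
    "example-client": {"https://app.example.com/oauth/callback"},
}


def is_valid_redirect_uri(client_id: str, redirect_uri: str) -> bool:
    """Accept a redirect_uri only if it exactly matches one registered for the client.

    Exact string comparison is deliberate: prefix or substring matching would let
    something like https://app.example.com/oauth/callback.evil.com slip through.
    """
    allowed = REGISTERED_REDIRECT_URIS.get(client_id, set())
    return redirect_uri in allowed


print(is_valid_redirect_uri("example-client", "https://app.example.com/oauth/callback"))
print(is_valid_redirect_uri("example-client", "https://evil.com/steal"))
```

the important design choice is that a failed check should show an error on the authorization server itself rather than redirecting anywhere, since the redirect target is exactly the thing that can't be trusted.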

crafting interpreters 25.2 – compiler upvalues

February 1, 2025 alan

the purpose of closure objects is to hold references to closed over variables. but how does it find those variables if they may or may not be on the stack? we can’t rely on the exact mechanism of local resolution because locals are always guaranteed to be on the stack during a function’s execution!

Since local variables are lexically scoped in Lox, we have enough knowledge at compile time to resolve which surrounding local variables a function accesses and where those locals are declared. That, in turn, means we know how many upvalues a closure needs, which variables they capture, and which stack slots contain those variables in the declaring function’s stack window.

the new abstraction introduced here is something called an upvalue. an upvalue is what the compiler sees as a closed over variable. what bob is saying above is that we can figure out exactly what our upvalues are at compile time and make sure that at runtime those variables are accessible on the vm stack

exactly how that is done is a bit more complicated and it’s not immediately clear when reading the section on upvalues how the implementation supports the eventual runtime variable capturing behavior. in a way he’s basically saying, here’s how we want to compile upvalues – trust me we’ll need this information at runtime when we create closures.

one of the first questions i had reading this section was, how does the vm at runtime, given these upvalue indices, differentiate between locals and upvalues? we know that locals live on the vm stack and that the OP_GET_LOCAL / OP_SET_LOCAL instructions index into the relative position of the stack inside call frames. but what about upvalues? not all upvalues are necessarily on the stack.

this wasn’t answered until later when he added an array of pointers to upvalues (ObjUpvalue** upvalues;) to closure objects. so these indices we’re building at compile time are going to index into that array in our closures. since these are pointers, they could be pointing at either captured variables that are still on the stack or maybe ones that bob eventually moves onto the heap.
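here's a toy python model of that idea, mostly to convince myself the indirection works. it is not clox code: the class names, the `close` method and the list standing in for the vm stack are all my own assumptions about where the book is heading.

```python
class Upvalue:
    """Toy stand-in for ObjUpvalue: refers to a stack slot until it is closed."""

    def __init__(self, stack, slot):
        self.stack = stack      # the (toy) vm stack the variable currently lives on
        self.slot = slot        # index of the captured variable on that stack
        self.closed = None      # heap copy of the value once the slot goes away
        self.is_closed = False

    def get(self):
        return self.closed if self.is_closed else self.stack[self.slot]

    def close(self):
        # Assumption: "moving to the heap" means copying the value out of the slot.
        self.closed = self.stack[self.slot]
        self.is_closed = True


class Closure:
    """Toy stand-in for ObjClosure: a function name plus its array of upvalues."""

    def __init__(self, function_name, upvalues):
        self.function_name = function_name
        self.upvalues = upvalues


stack = ["<script>", 10]                     # slot 1 holds the captured local
closure = Closure("inner", [Upvalue(stack, 1)])

stack[1] = 11                                # writes to the local are visible
print(closure.upvalues[0].get())

closure.upvalues[0].close()                  # variable is about to leave the stack
stack.pop()
print(closure.upvalues[0].get())             # still readable through the upvalue
```

the compiler only ever has to hand out an index into `closure.upvalues`; whether the value behind that index is on the stack or on the heap is the upvalue object's problem.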

from objfuncs to objclosures

February 1, 2025 alan

at compile time, at the end of a function’s block compilation we now emit a new instruction OP_CLOSURE that the VM will use at runtime to wrap our function objects within a new closure object. the idea is that we’re going to use this closure object to store references to closed over variables (upvalues).

as a refresher, each time we create a compiler instance per function declaration, we also create a new function object via newFunction.

static void function(FunctionType type) {
  Compiler compiler;
  initCompiler(&compiler, type);
  beginScope();
  ...
  ObjFunction* function = endCompiler();
  // emit a closure instruction!
  emitBytes(OP_CLOSURE, makeConstant(OBJ_VAL(function)));
}

...

static void initCompiler(Compiler* compiler, FunctionType type) {
  compiler->enclosing = current;
  compiler->function = NULL;
  compiler->type = type;
  compiler->localCount = 0;
  compiler->scopeDepth = 0;
  compiler->function = newFunction();
  current = compiler;
  if (type != TYPE_SCRIPT) {
    current->function->name = copyString(parser.previous.start,
                                         parser.previous.length);
  }
  ...
}

now at the end of the function compilation we make sure to emit an OP_CLOSURE so that at runtime, we use that opcode to wrap the raw ObjFunction in a closure and push it onto the stack.

below is the disassembly of fun foo() { fun bar(){} }

> fun foo() { fun bar(){} }
== bar ==
0000    1 OP_NIL
0001    | OP_RETURN
== foo ==
0000    1 OP_CLOSURE          0 <fn bar>
0002    | OP_NIL
0003    | OP_RETURN
== <script> ==
0000    1 OP_CLOSURE          1 <fn foo>
0002    | OP_DEFINE_GLOBAL    0 'foo'
0004    2 OP_NIL
0005    | OP_RETURN
          [ <script> ]
0000    1 OP_CLOSURE          1 <fn foo>
          [ <script> ][ <fn foo> ]
0002    | OP_DEFINE_GLOBAL    0 'foo'
          [ <script> ]
0004    2 OP_NIL
          [ <script> ][ nil ]
0005    | OP_RETURN

there’s a couple of interesting things about this design choice

every function, regardless of whether it closes over variables, will be treated like a closure at runtime. this adds both overhead through the creation of each closure object and indirection
closed over values are stored on the closure instead of the function, which nicely reflects the reality that we may have multiple different closures of the same function!

calls and functions and why fixed stack locals don’t work

January 18, 2025 alan

so far we’ve only been writing statements at the top level of the program. there’s no notion of a callable chunk of code. with the introduction of functions in chapter 24, all the current top level states like the compiler, locals, and chunks / instructions are moved into function objects

previously with locals we were effectively operating in a single function world. this meant that all locals were allocated at the beginning of the global call stack. with functions that each have their own local environments, the author introduces an early idea that was implemented by fortran where different functions had their own fixed set of locals

this works if there’s no recursion and i’ll demo an example that shows why fixed, separate slots break down once you start to recurse:

fun factorial(n) {
    if (n <= 1) return 1;
    return n * factorial(n - 1);
}

factorial(3);

assume we give factorial its own fixed set of stack slots

Slot 0: parameter n
Slot 1: temporary result for multiplication (the value in slot 0 * factorial(slot 0 - 1))

now call factorial(2). this produces slot 0 = 2 and slot 1 = 2 * factorial(1)

then call factorial(1). this produces slot 0 = 1

OH CRAP, but that just overwrote slot 0 = 2, which we still need to compute 2 * factorial(1) in the previous call. when that call resumes and reads n back out of slot 0 it gets 1, so it ends up computing 1 * 1 instead of 2 * 1 and screws up the entire expression

bob notes that fortran was able to get away with fixed stack slots simply because they didn’t support recursion!
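here's a small python simulation of the same breakdown. it's a sketch, not how fortran or clox actually store things: `fixed_slots` is a made-up stand-in for the single fixed slot, and it assumes n is read back out of the slot after the recursive call returns. the second function gives every call its own slot on a stack for comparison.

```python
# One fixed slot for factorial's parameter n, shared by every call (hypothetical model).
fixed_slots = {"n": None}


def factorial_fixed(n):
    fixed_slots["n"] = n                      # every call writes the same slot
    if fixed_slots["n"] <= 1:
        return 1
    result = factorial_fixed(fixed_slots["n"] - 1)
    # By now the inner calls have overwritten the slot, so this reads the wrong n.
    return fixed_slots["n"] * result


def factorial_frames(n, stack):
    stack.append(n)                           # fresh slot for this call only
    slot = len(stack) - 1
    if stack[slot] <= 1:
        result = 1
    else:
        sub_result = factorial_frames(stack[slot] - 1, stack)
        result = stack[slot] * sub_result     # our own slot is untouched
    stack.pop()                               # release the slot on return
    return result


print(factorial_fixed(3))        # 1, not 6: every frame ends up seeing n = 1
print(factorial_frames(3, []))   # 6
```

with a slot per call, each frame reads its own n after the recursive call returns, which is what the call frames in this chapter provide.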

lox vm local variables visualization

January 14, 2025 / January 15, 2025 alan

in chap 22 of crafting interpreters, bob nystrom walks through the implementation of local variables. it makes efficient use of memory by tracking local variable position and scope metadata during compilation phase and leveraging that to locate the correct value in the immediate proximity within the execution stack (where we expect all local variables to end up, unlike globals which are late bound and may be defined far away from where they’re actually used).

what i found most complicated about this chapter is the number of states you need to track and hold to understand how the compile and runtime stages work together. it helped me to write down a few essential states in trying to understand it, so i figured i’d translate those notes to some sort of visualization because i think it might help others too

here’s a visualization of the compile phase where we’re converting the tokens into a byte code instruction sequence (chunks). the arrow indicates the parse position where the compiler is pointing in the source code and the variables on the right represent the state at that point.

side note: i didn’t bother doing character by character – i moved the arrow to positions where there are actually side effects since not all tokens produce the side effects i actually care about for this demo.

and here is the runtime execution of the resulting byte code sequence. as you can see, the first thing that happens is that the literal number 13 is pushed onto the stack. every variable declaration’s value will be known at compile time.

however, notice that there is no information about what the name of that constant is. is 13 the value of “foo”? or something else? what’s cool about this implementation is that it doesn’t matter at this time because during the compilation phase, we’ve already figured out where the local for the variable foo is going to be on the stack. based on the information about locals and offsets in the previous phase, it’s going to be at position or offset 0 based on the metadata from the locals array that was getting constructed at compile time.
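the compile-time bookkeeping can be sketched in a few lines of python. this is a loose model, not the book's C: the class and method names are mine, and it leaves out everything except the locals array and scope depth. the point is that a name is turned into a slot number during compilation, so the runtime never needs the name.

```python
from dataclasses import dataclass


@dataclass
class Local:
    name: str
    depth: int  # scope depth at which the local was declared


class Compiler:
    """Hypothetical, stripped-down model of the compiler's locals tracking."""

    def __init__(self):
        self.locals = []
        self.scope_depth = 0

    def begin_scope(self):
        self.scope_depth += 1

    def end_scope(self):
        """Leave a scope, discarding its locals; returns how many were discarded."""
        self.scope_depth -= 1
        popped = 0
        while self.locals and self.locals[-1].depth > self.scope_depth:
            self.locals.pop()
            popped += 1
        return popped

    def declare_local(self, name):
        self.locals.append(Local(name, self.scope_depth))
        return len(self.locals) - 1  # index in locals == slot on the runtime stack

    def resolve_local(self, name):
        # Walk backwards so the innermost declaration of a name wins.
        for slot in range(len(self.locals) - 1, -1, -1):
            if self.locals[slot].name == name:
                return slot
        return -1  # not a local


# Roughly: { var foo = 13; { var bar = 7; print foo; } }
compiler = Compiler()
compiler.begin_scope()
foo_slot = compiler.declare_local("foo")
compiler.begin_scope()
bar_slot = compiler.declare_local("bar")
print(compiler.resolve_local("foo"), compiler.resolve_local("bar"))  # 0 1
compiler.end_scope()
print(compiler.resolve_local("bar"))  # -1, bar went out of scope
```

the slot number returned by `resolve_local` is what would end up as the operand of the get/set local instruction.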

first half marathon and training plan

January 10, 2025 / March 24, 2025 alan

i’m planning on doing my first half marathon this year, the syracuse half marathon! i’m also going to be posting my training updates here, mostly for myself to refer to

the race is on march 23, so that’s 10-11 weeks from now. that’s plenty of time for a good training block. my A goal is to finish in 1:45 (about 8:03 mile pace, 25min 5k pace), my B goal is to finish in 1:50, and my C goal is to finish just somewhere close to two hours. this is all based on my current threshold pace for 5k, which i think is around 7:45 – 8:15 mpm, and i’m trying to keep it conservative.

training wise i’m adapting the novice marathon program in hal higdon’s Marathon Guide book for a half marathon. a couple of interesting parts of his training program are the long run mile step back every 3rd week and the gradual increase of the mid week mileage. the purpose of the step back is to support recovery after a couple of consecutive mile increases before building back higher

unlike his novice program, instead of doing a saturday long run i’m doing a sunday one followed by a recovery run. he also packs all three non-long runs together consecutively but i like having space between those runs for cross training / strength training or just rest – so i adjusted that too.
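quick sanity check on the goal paces, as a python sketch. the function name is made up and the half marathon distance is taken as 13.1094 miles (21.0975 km); for exactly 1:45:00 it comes out a couple of seconds per mile quicker than the rounded 8:03 above, which is close enough for goal setting.

```python
HALF_MARATHON_MILES = 13.1094  # 21.0975 km


def pace_per_mile(finish_seconds: int, miles: float = HALF_MARATHON_MILES) -> str:
    """Average pace as m:ss per mile for a given finish time in seconds."""
    seconds = round(finish_seconds / miles)
    return f"{seconds // 60}:{seconds % 60:02d}"


goals = {"A (1:45)": 1 * 3600 + 45 * 60,
         "B (1:50)": 1 * 3600 + 50 * 60,
         "C (2:00)": 2 * 3600}

for name, finish in goals.items():
    print(name, pace_per_mile(finish))
# A (1:45) 8:01
# B (1:50) 8:23
# C (2:00) 9:09
```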

overall i’m optimistic about this program because it’s not too far off from my current weekly mileage and i’m coming off of a short break from running due to the weather lately, so i should adapt well to this but who knows. since i am going to be targeting a specific pace i know i need to throw some speed work and threshold runs in there so the breaks between runs mid week should help

here’s my full schedule (thanks claude ai for formatting my original csv into a table)

note: run 1 follows a long run, so that will be an easy run. run 2 and 3 will both be easy if i’m not feeling great, but ideally one or both of them are threshold runs. will play it mostly by ear

Week	Run 1	Run 2	Run 3	Long Run	Total Miles
1	3	3	3	6	15
2	3	3	3	7	16
3	3	4	3	7	17
4	3	4	3	5	15
5	3	4	3	9	19
6	3	5	3	10	21
7	3	5	3	7	18
8	3	6	3	12	24
9	3	6	3	10	22
10	3	6	3	8	20

for race pace and finish times i like to use this chart.

training log

i’m going to keep short updates here as i progress

1/15

training going well, been hitting the workouts and also did a tuesday short 45min group running training sesh (plyometrics mostly) at gym
today did a 3miler on treadmill, 10min warm up and 10 cooldown with threshold pace in middle
TIL that 1% incline is good for imitating wind resistance friction for treadmill + lower knee impact. makes sense
form / mechanics notes: working on landing softer, more knee drive and less lower leg extension
pace feels a bit quick – will work on increasing incline but reducing pace
also may look into interleaving outdoor runs with treadmills at some point, weather permitting…

1/16

OK, so today i think i’m officially starting to overtrain…. i did a 1hr yoga at 5:30, 45min circuit training and sprinting at 8 followed by a 3 mile threshold. um my right foot ankle felt wonky and weird to put pressure on. i think i also laced my shoes too tight on the right
i ALSO tried a slightly different gait (shortening leg extension to land closer to my center of mass) at the same time, which honestly felt better
i made an effort to run more lightly on the treadmill today (focusing on reducing impact sound mostly) and my stride felt much smoother
anyway, im pretty much set with my mid week mileage (9 total so far) so im just gonna rest up for my long run over the weekend (6mi)

1/19

completed my first long run of the halfy training!! ran outdoors for 6.2 miles. most of the route i picked was pretty snow packed and my legs were sinking with each stride. left calf muscles and achilles feel pretty sore – not sure if snow or new shoes (lone peak altras) or both
i had to avoid sidewalks in a couple of .5 – 1 mile stretches and ran pretty close to threshold pace because i wanted to get off the road quickly
feeling good though, the snow def. forced me to slow down for most of it. got to get in a bit of hill work at beginning at end too. overall great workout

1/20

did a 3 miler today, started at easy pace and then did threshold for about a mile before dropping back to easy. good workout, but in future im going to try to hold the pace for the entire session, and reserve threshold workouts for specific days. it does seem like the treadmill picks up my HR reader so that’s good!

1/22

currently in week 2 of my training block. did a 5km with 3x1k repeats at 90-95% max HR. felt really tough, esp. towards the end. i actually cut short the last repeat by about 200m for sake of time and also because i was at my limit
good workout, but i think maybe a shorter interval like 400m followed by 30s to 1min rests would be good in future
wore olympus via 2 for first time today, bought a used pair for 80 bucks off ebay. really loving this new model. the via 1 has a very stiff / firm sole, and they seemed to have incorporated that feedback. on their site: “It’s that same high stack but with a softer midsole foam”. really like it now! might retire my via 1’s

1/24

did my final midweek 3mi workout of week 2! really didn’t feel like it because I didn’t sleep too well and it’s cold af. snow day and streets were a bit unplowed so I decided to scrap my original plan of going to the gym and instead go outside. wasn’t too bad, although starting earlier might be better because a lot of people around 7-8 were pulling out of their garages to go to work
did a fartlek workout with about 100m hill repeats 3 times. the entire run was pretty hilly, about 300ft elevation gain so 100ft / mile. pretty tough workout
felt slight ache on left knee (weirdly my right knee has not bothered me at all) that went away with a quad stretch. will keep an eye on

1/27

long run on sunday ended up being 8 mi instead of 7 b.c of a wrong turn. felt good. went out a bit too fast. should stick to 9-10min miles. ended up in a snowy patch which wasn’t great and had to walk for safety reasons
need better gloves if im to do more outdoor running…
feeling good, did an easy 3mi on the treadmill this morning with a short 3-4min tempo.

1/31

did a final three mile run today at the gym for the week. was not feeling very motivated and pretty fatigued, probably from not great sleep this week. watch tells me that my HRV is low. was supposed to do a workout but I just took it easy
for the four mile run earlier this week, I did 4x800m repeats. that was a pretty good workout, ran pretty close to my max heart rate

2/9

completed my 4th week of halfy training!! did 5 miles yesterday on the road, did it around 5pm and that was a bad idea b.c there were lots of cars. i should try to avoid side street roads unless running very early. run felt terrible at first because i had a heavy lunch only couple of hours earlier
next two weeks are going to be slightly higher in mileage. 8min mile pace still feels a bit challenging, im getting a bit nervous about sustaining that for 13 miles… i feel like i need to extend my long runs a bit and work in actual speed sessions off treadmills to be ready. or bump up the incline? or maybe just alternate indoor and treadmill 2 miles at a time

2/12

finished first workout of the week, 5 miles. 1 mile warm up, 4x1000m at 7:40ish pace with 1.5-2min walk in between, 400m at 7:00 pace and remaining cool down. felt strong, wearing my hoka skyflows (varsity navy, wide) – they feel the best so far running. my olympus via and via 2 from altra both aggravate either my right big toe or right knee
next workout i want to shoot for a 3 mile tempo, would be nice to do it outside too but the roads are still icy.

2/17

did my 9mi long run indoors, 3 on track, 6 on treadmill. left foot tingly at end… shoelaces too tight probably. small blister on left pinky toe. overall felt good
huge snow storm, may have to skip week day runs this week
so … i just realized that my table above actually has an extra run early in the training that means the last long run lands on race weekend, which was NOT the plan. i really only have 4 weeks left, not 5. i want to make sure i build up to 12 and have a taper, so next run will be 10. then 12. then we will taper off for 2 with a 10 and then an 8
for remaining weeks, i want to make sure at least one midweek run is at race pace (8min)

2/19

had a 5 mile workout today with a 3 mile pace workout, had to cut the cool down short because my blisters were so bad. left pinky toe and both sides of my big toes were very irritated. i was wearing my hoka skyflows. they do cramp my toes a bit, as much as i love the support. i have a pair of injinji toe socks coming so we’ll see if that helps

2/23

did 10miler today in injinji socks and my altra torin 7’s (wide toebox). was successful b.c no new blisters! yay. did a 4 on track, 3 on treadmill, and final 3 on track. avg 10min mile pace
did truly easy pace and this felt really good, listened to atomic habits
at like 8/9 miles i can def. feel the fatigue start to set in, but wasn’t too bad
would be great to do strides next week at gym, building up to a 12 miler. although we’re also going to be in D.C by then, so may need to figure out what to do then. need to remember to pack outdoor running clothes

2/26

ran with a group today! we did a 5.8 mile loop at 6pm at about an 8min pace. felt great. didnt feel easy but also not hard – im feeling good with another 4 weeks to the half. need to stay consistent and healthy. wore my lone peaks, rei base layer wool. also 40 deg is bit too hot for mittens

3/7

did 12 miles at the gym yesterday, mostly broken up. 2.5 miles before a run class, then did the rest in a mix of treadmill and indoor track. feeling better today than i did after my 10mi run a couple of weeks ago. had to skip the sunday long run b.c i was traveling this week, so a couple of midweek workouts got skipped. some sore calves, achilles so could do some foam rolling. no blisters this time, wore my torins
this weekend it would be good to shoot for a 14 mile run. preferably outdoors. im feeling pretty strong and i think overshooting a bit makes sense given that im aiming for a time goal for my first half. just take it real easy. real challenge will be deciding my route and making sure im recovering adequately the next couple of days

3/10

yesterday i did an 11mi (instead of 10 according to training plan). about a 10:30 pace. wore the torins. the gloves were too cold about 5miles in, even with liners. need to stick with mittens in future. a single base layer felt fine though in 25-30 degrees, but i think another layer wouldve been comfortable. felt good for up to mile 9 / 10 and the last couple my legs felt real heavy and my left shoe felt really loose. actually i had to tie my left shoe once at mile 2 because it came loose, and i didnt want to take my gloves off to tie my shoes again!
would like to squeeze in a 10k tempo at race pace this week, and MAYBE a speed session

3/12

very happy w. workout this morning, did a 7:50ish pace for 5 miles and it felt good. not too hard and faster than race pace so im feeling confident. i was going 6:00 for a bit and that spiked my heart rate and i had a hard time recovering from that, so really need to make sure i dial in my pace during the race. cant go out at the start any faster than 8min pace or im going to have a rough time
gonna do a very easy recovery run tomorrow on treadmill maybe a couple of miles and do the run class w. ryan. have another hard workout on friday that i want to hit, 3 miler. im thinking 10 min warm up, then 4×400 with 30-60s rest. then another mile cooldown. gonna scout out the condition of the track beforehand

3/20

in my tapering week, did all my runs outside last week and had good workouts. did an 8 miler outside over weekend and felt a bit of right toe joint pain, went away after i loosened my laces (i was wearing torin 7) and quickened my cadence. yesterday i did a hard 8×400 interval, was hitting 645/7 mpm pace for each 400 with 90s recovery. great workout, feeling sore today. thats prob my last workout though. only easy runs sprinkled with strides from here on out

3/21

1mi warmup and cool down, 1mi at lt pace. was doing a 6:43 mile pace, felt very difficult and def. past lt based on how my legs were fatiguing

post race update

finished in 1:43:14 – two minutes under my target time goal!! very happy with the result. the course had several long climbs in the first 1-3 miles, rolling hills to like mile 8-9 and then a couple more short hills at 10/11. i was able to comfortably settle into a 7:40ish pace for a good part of the race. i started pretty conservatively (though i did get caught up in the excitement for 30 seconds and found myself at a 6:30 pace lol). the last 2 miles my legs were dead: they just felt real heavy and my left ankle started to hurt. didn’t take any gels, just had a banana an hour before the race and took in water and gatorade at aid stations. fuel wise i felt fine. i think my legs just aren’t conditioned to endure a sub 8 pace for over 10 miles tbh.

running economy and vo2

January 1, 2025 (updated January 4, 2025) alan

running economy is a complicated topic and hard to measure, but a common approach is to use vo2 (volume of oxygen consumed) as a proxy. according to wikipedia, “Those who are able to consume less oxygen while running at a given velocity are said to have a better running economy”.

i put together a few visuals to illustrate this concept better.

here’s a graph showing

oxygen consumption or vo2 on the y axis
velocity in meters per second on the x axis
as velocity increases, so does oxygen consumption. they increase together up to a point (vo2 max)
oxygen consumption plateaus / steady states at the vo2max at and beyond a specific velocity

now, if the athlete is able to train their aerobic system to run at the same velocity with lower oxygen consumption, you get this graph

the dotted black vo2 consumption at given pace is the original line. the new solid line is as a result of training
same pace, but lower o2 consumption. this athlete has improved their running economy!
similarly, if you graph the relationship between vo2 and velocity for different athletes, the one with the lower vo2 consumption at any given pace is more economical

i also find this relationship interesting because it also tells you why increasing vo2 max is valuable. vo2 max sort of represents near maximum / max effort and running at vo2 max typically can’t really be sustained for longer than 11 minutes. right now, the athlete can only run at their max for 11 minutes. if you shift the max up, here’s what happens

the previous velocity is now a fraction of max, so less effort is required to sustain the same pace. they can now race at that same pace for longer! better endurance
the new max is associated with higher velocity. their previous 11 minute high effort pace is even faster

precise vo2 max testing is typically done in a lab, hooked up to a mask that measures oxygen consumption while running on a treadmill at increasing intensity. one of my favorite running youtubers is luis orta (venezuelan runner and olympic athlete). he does a vo2 max test here and gets an 80 mL/kg/min.

that is a ridiculous number because the average vo2 max for untrained individuals is around 30 – 40!

vo2 max at the end of the day is just a metric / one indicator. i used to see vo2 max videos everywhere on youtube when i first started running and it made me feel like i somehow needed to track it as part of my training. completely untrue.

jack daniels types of running training

December 29, 2024 alan

jack daniels classifies running training into four categories (see his lectures here). i’ll summarize here because i found it to be a helpful framework for building my own training program for the new year. each type adheres to the same general principle of minimum effort for the maximum gain. he says if you want to improve physiological function, you want to stress it. but you want to stress it at the lowest intensity of stress

easy runs

build aerobic base and ability to do higher volume runs
train at max stroke volume to gradually create cellular adaptations
- mitochondrial density
- fat oxidation
60% of max heart rate

threshold training

build endurance through pushing the lactate threshold. blood lactate accumulation happens at different paces / effort levels. so goal is to push accumulation farther out relative to effort
- accumulation is a function of how much is produced vs how much is cleared
- the threshold is the running speed beyond which blood lactate rises continuously instead of plateauing
- at or below threshold = steady state lactate accumulation (not rising)
training at threshold means training at the pace where going any faster results in lactate rising continuously
82 – 88% of mhr
threshold is basically pace you can hold for roughly 1 hour

interval

purpose is to maximize aerobic power: how much blood is delivered and how much of the o2 in it is converted to energy
aerobic power is approximated via vo2 max
- o2 consumption measured by millilitres of oxygen per kilogram of the body mass per minute (e.g., mL/(kg·min)).
- vo2 max is max rate of oxygen consumption
97 – 100% of MHR

repetition

kind of like intervals (honestly not sure why he called this out separately), except the focus is on even higher intensity followed by long rest periods. purpose is to improve running economy

as you go from easy running to repetitions, the main variables within a training session that change are intensity and volume. easy runs are high volume, low intensity. on the other end, repetitions and intervals are high intensity but low volume. this is a helpful lens through which to view running programs because the proportion of a type of training in a running program tells you the type of race or performance it’s effective for
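
to make the percentages above concrete, here’s a minimal sketch in C that turns each band into beats per minute. the Zone and BpmRange types, the ZONES table and zoneToBpm are all names made up for illustration, and the percentages are simply the ones quoted in these notes (easy is a single figure there, and repetition is left out because no heart rate band is given for it).

```c
#include <assert.h>

// one training type and its intensity band, as a fraction of max heart rate
typedef struct {
  const char* name;
  double lowPct;
  double highPct;
} Zone;

// bands as quoted in the notes above (assumption: easy has no upper bound listed)
static const Zone ZONES[] = {
  {"easy",      0.60, 0.60},
  {"threshold", 0.82, 0.88},
  {"interval",  0.97, 1.00},
};

typedef struct {
  int lowBpm;
  int highBpm;
} BpmRange;

// convert a zone into a beats-per-minute range for a given max heart rate
static BpmRange zoneToBpm(Zone zone, int maxHr) {
  assert(maxHr > 0);
  BpmRange range;
  range.lowBpm = (int)(zone.lowPct * maxHr + 0.5);   // round to nearest bpm
  range.highBpm = (int)(zone.highPct * maxHr + 0.5);
  return range;
}
```

for example, with a hypothetical max heart rate of 190, zoneToBpm(ZONES[1], 190) gives a threshold band of 156 to 167 bpm.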

while i really like doing threshold training, my current volume of training is low so right now i feel like i’m sacrificing base building when i really ought to aim at building more volume and developing a larger base. right now i do higher intensity training twice a week, but i may dial that back to just once a week and dedicate my other days to easy runs. it’s hard for me to do two intense sessions a week without feeling the impact on my joints / ligaments, particularly my right knee – which tells me i should probably scale back the intensity and just focus on volume

crafting interpreters chapter 17 notes – infix parsing with pratt parser

December 29, 2024 alan

there’s a saying that all problems in computer science / programming can be solved by another level of indirection. in this chapter the pratt parser is a great example of that when it comes to parsing expressions such as

simple numeric literals i.e 1 or 2
single operand / prefix expressions like -1
binary expressions like 1 * 2 involving numeric, equality, comparison, or logical operators
any complex combination of the above with groupings

back in jlox, expression parsing was based on recursive descent. in this chapter, the parse sequence is driven by a special function called parsePrecedence. two new abstractions (the parse rule table and the rule lookup function) come together in the parsePrecedence function which is going to be the new entry point to expression parsing

static void parsePrecedence(Precedence precedence) {
  advance();
  ParseFn prefixRule = getRule(parser.previous.type)->prefix;
  if (prefixRule == NULL) {
    error("Expect expression.");
    return;
  }

  bool canAssign = precedence <= PREC_ASSIGNMENT;
  prefixRule(canAssign);

  while (precedence <= getRule(parser.current.type)->precedence) {
    advance();
    ParseFn infixRule = getRule(parser.previous.type)->infix;
    infixRule(canAssign);
  }

  if (canAssign && match(TOKEN_EQUAL)) {
    error("Invalid assignment target.");
  }
}

here’s a truncated example of some parse rules in our parse table. it’s a mapping of token types to a group of metadata (prefix parser, infix parser, and precedence level)

ParseRule rules[] = {
  [TOKEN_LEFT_PAREN]    = {grouping, call,   PREC_CALL},
  [TOKEN_RIGHT_PAREN]   = {NULL,     NULL,   PREC_NONE},
  [TOKEN_MINUS]         = {unary,    binary, PREC_TERM},
  [TOKEN_PLUS]          = {NULL,     binary, PREC_TERM},
  [TOKEN_NUMBER]        = {number,   NULL,   PREC_NONE},
};

for the minus token, unary is the prefix parsing function, binary is the infix parsing function, and PREC_TERM is the precedence level of the infix operator. this is the getRule function that, given a token type, retrieves that metadata

static ParseRule* getRule(TokenType type) {
  return &rules[type];
}

what’s unique about this approach is

the relevant parse function for a given token consumed via advance is fetched dynamically from the parse rule table. so given a token type of NUMBER for parser.previous.type, the first thing parsePrecedence attempts to do is locate the prefix function for that token
- other prefix functions may themselves call back to parsePrecedence such as grouping if a left parenthesis is encountered
for chained expressions involving infix operators i.e 1 + 2 + 3, the current precedence level is used to decide how far to keep consuming the following expressions. binary parses its right operand at one precedence level higher than its own operator, which makes these operators left-associative, so 1 + 2 + 3 is parsed as ((1 + 2) + 3)
adding new tokens involves adding a rule for each token with its metadata (prefix parser, infix parser if it applies, and precedence level). the parsePrecedence function automatically obeys the precedence levels during parsing. in jlox, parsing precedence had to be carefully managed by ensuring that it’s reflected in the call sequence (top down execution where lower precedence parse functions call higher precedence ones)

unlike recursive descent top down parsers where the structure of the code reflects both the grammar and precedence order (lower precedence parse targets always invoke higher precedence ones), it’s harder to visualize the call sequence in a pratt parser because the exact call sequence is only apparent during runtime through calls to parsePrecedence (which decides how far to parse based on the current precedence). nevertheless this seems like a more extensible / configurable way to manage expression rules
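
to see the table-driven flow end to end, here’s a tiny self-contained pratt parser sketch in C. it is not the clox code: it evaluates the expression directly instead of emitting bytecode, treats single characters as tokens, and lets each parse function consume its own token instead of going through advance. Rule, cursor, skipSpaces and evaluate are names invented for this sketch. the parts that mirror the chapter are the rule lookup (getRule), the precedence-driven loop (parsePrecedence), and binary parsing its right operand one level higher.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

typedef enum { PREC_NONE, PREC_TERM, PREC_FACTOR, PREC_UNARY } Precedence;

typedef double (*PrefixFn)(void);
typedef double (*InfixFn)(double left);

typedef struct {
  PrefixFn prefix;
  InfixFn infix;
  Precedence precedence;
} Rule;

static const char* cursor;  // next unread character of the expression

static double parsePrecedence(Precedence precedence);
static const Rule* getRule(char c);

static void skipSpaces(void) {
  while (*cursor == ' ') cursor++;
}

static double number(void) {
  double value = 0;
  while (isdigit((unsigned char)*cursor)) value = value * 10 + (*cursor++ - '0');
  return value;
}

static double grouping(void) {
  cursor++;  // consume '('
  double value = parsePrecedence(PREC_TERM);
  skipSpaces();
  assert(*cursor == ')' && "expect ')' after expression");
  cursor++;
  return value;
}

static double unary(void) {
  cursor++;  // consume '-'
  return -parsePrecedence(PREC_UNARY);
}

static double binary(double left) {
  char op = *cursor++;
  // one level higher than the operator itself => left-associative
  double right = parsePrecedence((Precedence)(getRule(op)->precedence + 1));
  switch (op) {
    case '+': return left + right;
    case '-': return left - right;
    case '*': return left * right;
    default:  return left / right;
  }
}

// rule lookup: character -> {prefix parser, infix parser, infix precedence}
static const Rule* getRule(char c) {
  static const Rule none   = {NULL,     NULL,   PREC_NONE};
  static const Rule digit  = {number,   NULL,   PREC_NONE};
  static const Rule paren  = {grouping, NULL,   PREC_NONE};
  static const Rule minus  = {unary,    binary, PREC_TERM};
  static const Rule plus   = {NULL,     binary, PREC_TERM};
  static const Rule factor = {NULL,     binary, PREC_FACTOR};
  if (isdigit((unsigned char)c)) return &digit;
  switch (c) {
    case '(': return &paren;
    case '-': return &minus;
    case '+': return &plus;
    case '*':
    case '/': return &factor;
    default:  return &none;
  }
}

static double parsePrecedence(Precedence precedence) {
  skipSpaces();
  PrefixFn prefix = getRule(*cursor)->prefix;
  assert(prefix != NULL && "expect expression");
  double left = prefix();
  skipSpaces();
  // keep folding infix operators that bind at least as tightly as requested
  while (precedence <= getRule(*cursor)->precedence) {
    left = getRule(*cursor)->infix(left);
    skipSpaces();
  }
  return left;
}

static double evaluate(const char* source) {
  cursor = source;
  return parsePrecedence(PREC_TERM);
}
```

evaluate("1 + 2 * 3") returns 7 because the right operand of + is parsed at PREC_FACTOR and swallows the multiplication, while evaluate("8 - 3 - 2") returns 3 because the second minus is handled by the outer loop, not by the first minus’s right operand.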

purpose of zone 2 easy runs

December 24, 2024 alan

i went for an easy run this morning and was thinking about the purpose of training and zone 2. a cornerstone of pretty much any aerobic training program is the easy (zone 2, 60-70% of max heart rate or 5-6 RPE) run. there’s usually the long easy run combined with shorter easy runs throughout the week. when i first started training for longer races (15k), i thought the sole purpose of these longer runs was to progressively overload until i’m comfortable running the race distance. so if i’m training for a 15k, i’m increasing my ability to sustain a comfortable aerobic effort little by little until i’m able to do it for my desired distance.

if i’m training for a 5k, there must not really be a purpose of doing these longer runs. right? there’s a principle in training called specificity – basically it means you tailor your training to the specific energy system and skills that you are trying to improve. so if you’re trying to become a better long distance runner, run long distances. if you’re trying to become a better sprinter, sprint! this seems pretty intuitive, except what’s not obvious is that if you want to become a better runner at any distance, you also want to incorporate long runs!

base endurance

i’m not really an expert on physiology and there’s a ton of resources covering the benefits of long runs, but my layman understanding of this so far is that doing easy runs at roughly 60% of MHR is what allows you to

build your heart muscle (increasing stroke volume or how much blood can be pumped per beat) with minimal effort
these improvements are primarily a function of duration. so, generally speaking, the longer you are working your heart at that intensity the more of the benefits you get (up to a point, we can’t run forever without risking injury).
allow your body (muscles, bones, ligaments, joints, etc) to gradually adapt to higher volume
by doing easy runs at higher volume without injury, you unlock higher volume of more intense workouts into your schedule. someone who is comfortably running 30 miles a week can introduce a couple of 5k intense threshold runs into the week to build even more speed and endurance. if you’re doing 5 miles a week, there’s just no room for that. nothing wrong with running 5 miles a week, but my point here is to illustrate the relationship between steady state volume and training opportunity

the minimal effort point here is pretty key. you can train at far higher intensities to build your heart muscle, but it turns out your heart’s current maximum stroke volume is reached at 60% of MHR. so if you do a full out run, your stroke volume is still the same – you’re just expending more energy for the same heart muscle building benefits. also, doing high intensity runs all the time means you likely sacrifice volume, aka less time overall in this zone. people are also all different – in some situations there may be runners that can do very high volume and intensity and that works for them. i know that’s not me 😀

there are also numerous other related responses that support this gradual volume buildup of the heart muscle, a couple that i notice come up often are:

increase mitochondrial density (mitochondria generate energy in a cell using oxygen and glucose) so higher numbers of mitochondria means being able to use more of the available oxygen and glucose during aerobic activity

increase the ability to use fat stores as fuel instead of only glucose via glycolysis (able to run longer)

so over time, spending a lot of time in easy runs builds the heart muscle and its ability to pump out blood and increases your capacity to make use of that higher volume of blood per beat thanks to cellular level changes like mitochondrial density (more efficient). how this translates to races is that you’re able to do them at any distance without getting as tired because your aerobic system is more efficient. and because of the gradual buildup in your overall muscular strength you can run at higher volumes at a comfortable pace per week. this higher mileage then unlocks higher quality / higher volume intensity training.

jack daniels, a well known running coach, often says that you should know the purpose of your training. why are you running today? what is the purpose of this long run? well there’s the purpose of long runs. you do long easy runs because it builds the very foundation of your aerobic performance.

dyson v11 trigger repair & tips

December 24, 2024 alan

back in January this year i ordered a refurbished dyson v11 off newegg (the full model name is V11 Animal+ Cordless Vacuum) for about $300 (new ones were close to $600) and it was working great up until the end of November. the problem was that the trigger had stopped working – it wasn’t springing back into its normal position after being pressed and wouldn’t turn on the vacuum anymore.

turns out this broken trigger on the v11 is a well known issue and it’s caused by a weak plastic arm / lever on the trigger assembly. it’s frustrating because why the hell would you make such a high use component that gets subjected to repeated force out of thin plastic instead of metal? or at least make the plastic arm thicker so it doesn’t just crack in less than a year of use.

thankfully because this is such a common issue there were repair tutorials online and spare parts available through ebay. i was able to finally finish the repair yesterday and in this post i’ll share what resources i used and some tips (both for others and for myself in the future if i need to do this again…)

here’s the youtube video that documents the disassembly process and required tools. just a heads up, the trigger mechanism is embedded pretty deep and requires basically an entire disassembly of the vacuum. the video is less than five minutes long but i think it took me closer to 45min to get it all apart.

tips

you WILL need all the tools mentioned in the video. definitely the long torx screwdriver and pliers. you won’t be able to remove the trigger assembly without a pair of pliers (i tried). it will also be helpful to have some kind of gripper (things that look like tweezers but for electronics, most electronic repair tool kits will come with this) to grip on to wires later during re-assembly

buy a new complete trigger assembly with metal switch (or at the very least a metal trigger piece to replace the plastic trigger with). yes it’s pretty funny that there’s apparently an entire market providing more durable switches for the v11 than dyson themselves. in my first go at this, i did what the video suggested and tried gluing the broken trigger with superglue. i do not recommend doing this because the trigger ended up breaking immediately again and i had to repeat the entire process. maybe i didn’t let it cure long enough. maybe my super glue wasn’t super enough. whatever, just save yourself the trouble and replace the entire assembly. below is an image of one i found on ebay (note that it says v10 – it’s also compatible with v11).

during reassembly, there will be a point where you need to straighten / bend the metal ends of the electric connectors in order to pass it through various parts of the vacuum. you’ll know what i’m talking about if you end up going through the full disassembly. try not to bend/re-bend them too many times because you can easily break off the metal ends (see below)

in my first pass at this after i had glued the trigger back together, i actually broke off the metal piece by accident when trying to bend it back and then spent over an hour trying to re-solder it back on. i also have no idea how to properly solder and ended up burning a hole in my table cloth. anyway when you’re re-connecting those metal connectors back, use your pliers to adjust them to be close to 90 degrees (as they were before you had to remove them) but it honestly doesn’t have to be perfect. just use the screws to tighten them against the motherboard.

lox vm scanner

December 20, 2024 alan

in chapter 16 for the lox vm, the scanner implementation takes a completely different approach compared to jlox. when we implemented jlox, the scanner did a full scan of the source file and created all the tokens in memory up front for the parsing phase

in the C implementation, the source is still read into memory but we don’t build a separate list of all the tokens up front. instead the scanner refers directly to the source and we only create tokens as they are needed (the parser holds no more than 2 at a time, current and previous, since lox’s grammar only requires a single token of lookahead). this is a lazier and more memory efficient approach.

for example, here’s the scanner struct and how it’s initialized

 typedef struct {
   const char* start;
   const char* current;
   int line;
 } Scanner;

 Scanner scanner;

 void initScanner(const char* source) {
   scanner.start = source;
   scanner.current = source;
   scanner.line = 1;
 }

start refers to the beginning of a lexeme (say, an identifier)
current is the current character being scanned
there’s also some additional metadata like line number for debugging support

and this is the Token struct for representing a complete lexeme

typedef struct {
  TokenType type;
  const char* start;
  int length;
  int line;
} Token;

start is a pointer to the source – again we’re not allocating additional memory to hold token information
type is an enum value identifying the kind of token, like TOKEN_IDENTIFIER

with the scanner and the token structs in place, the compiler drives the actual changes to these objects as it scans as much of the source code as it needs (and constructs tokens) to emit byte code sequences

ObjFunction* compile(const char* source) {
  initScanner(source);
  Compiler compiler;
  initCompiler(&compiler, TYPE_SCRIPT);

  parser.hadError = false;
  parser.panicMode = false;

  advance();

  while (!match(TOKEN_EOF)) {
    declaration();
  }

  ObjFunction* function = endCompiler();
  return parser.hadError ? NULL : function;
}

calls to advance and declaration will both eventually call out to scanToken, which uses the scanner to read and construct the next token. for example if the token is a number, the compiler emits two bytes (the OP_CONSTANT opcode and the index of the value in the constant table) via a call to emitConstant(NUMBER_VAL(value));

the entire sequence of bytecodes is built this way, the compiler driving the scanner forward and emitting byte code sequences on the fly.

migrating a rails app from mongo to postgresql

December 12, 2024 alan

my team and i recently completed a database migration from mongodb to postgresql for one of our rails apps. the service is a graphql api built on rails 7 and was backed by a mongodb database (m40 cluster managed through mongo’s atlas platform) with ~500gb of data. we performed a live zero-downtime migration to a db.m5.2xlarge RDS instance running in our own aws account. the application is organized like a pretty standard rails app. all data is represented by rails models and data access is done through an object mapping layer using mongoid, mongo’s object document mapper (ODM).

the requirements for this project were pretty straightforward

stop using mongo
don’t take our service down to do an offline migration (given the amount of data we needed to move, the maintenance window we would need would’ve been way too long anyway based on some of our initial tests)

our high level approach was the dual write pattern: write to both data stores and put reads behind dynamic feature flags, backfill the tables one collection at a time, switch the reads over to the new database, and then cut off the old reads and writes.

this is a very common technique in service to service migrations when teams undertake monolith to microservice transitions (which were all the rage five to ten years ago, but the trend is reversing as of late) and the same process can be applied to switching data stores within the same service. the only difference is that the new reading/writing code in the service hits a new data store instead of a new api / service.
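
the control flow of that pattern can be sketched in a few lines. this is a toy model in C (to match the other code on this blog), not the rails / active record code from the project: the in-memory Store, the readFromNewStore flag, saveRecord, loadRecord and backfill are all hypothetical names, and the “skip records the new store already has” rule in backfill is an assumption rather than something the project necessarily did.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_RECORDS 16

// toy in-memory store standing in for a real database
typedef struct {
  int values[MAX_RECORDS];
  bool present[MAX_RECORDS];
  bool failWrites;  // lets us simulate an outage on this store
} Store;

static Store oldStore;                 // plays the role of mongo
static Store newStore;                 // plays the role of postgres
static bool readFromNewStore = false;  // the feature flag guarding reads

static bool storeWrite(Store* store, int id, int value) {
  assert(id >= 0 && id < MAX_RECORDS);
  if (store->failWrites) return false;
  store->values[id] = value;
  store->present[id] = true;
  return true;
}

static bool storeRead(const Store* store, int id, int* out) {
  assert(id >= 0 && id < MAX_RECORDS);
  if (!store->present[id]) return false;
  *out = store->values[id];
  return true;
}

// dual write: the old store stays the source of truth, the new write is best effort
static bool saveRecord(int id, int value) {
  if (!storeWrite(&oldStore, id, value)) return false;
  // failure is deliberately ignored so the request still succeeds
  (void)storeWrite(&newStore, id, value);
  return true;
}

// reads go to whichever store the flag currently points at
static bool loadRecord(int id, int* out) {
  const Store* source = readFromNewStore ? &newStore : &oldStore;
  return storeRead(source, id, out);
}

// backfill: copy over anything written before dual writes were enabled
static int backfill(void) {
  int copied = 0;
  for (int id = 0; id < MAX_RECORDS; id++) {
    if (oldStore.present[id] && !newStore.present[id]) {
      if (storeWrite(&newStore, id, oldStore.values[id])) copied++;
    }
  }
  return copied;
}
```

the order of operations matters: dual writes go live first so nothing new is missed, then backfill() catches the older records, and only after that does readFromNewStore get flipped. if the new store fails a write in the meantime, saveRecord still reports success, which is why a separate verification pass is needed before switching reads.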

setup phase

we started by setting up an initial connection to postgres and added some basic tooling

set up the postgres database and the rails integration. our infrastructure teams spun up our new postgres instance on RDS, sized comparably to the current storage on atlas. in the rails app, we set up the active record ORM alongside the existing mongoid ODM and updated both our development and CI setup to spin up a postgres image

set up data transfer / backfilling utility scripts that extract mongo document data for a given collection, transform it into a postgres compatible format and insert it into the postgres database. for example, nested documents become normalized foreign key relationships

set up feature flagging (we used flipper) to dynamically control the read switch. double writing was not behind switches, but we made sure to wrap our new writes with catch-all exception handling so they would never interrupt requests

double writes

we divvied up most of the work by resource types and tackled them in the order of some combination of entity complexity (lots of relationships, super nested) and data volume (getting an early start on the largest collections was important since we had deadlines to hit).

for each resource in the system, we did the following

create active record equivalents of the current ODM models. so this means bringing over model level unit tests, validations, and any database level constraints. to uniquely identify migrated data, we made sure to include a mongo_id column on every new table
set up dual writes. most of the writes happen through graphql mutation resolvers at the graphql API layer so this involves adding adjacent active record write logic.
duplicate existing unit and functional tests to cover the new models and code
set up the backfilling code. the shared migration script was sufficient for most of our data (simple batch read, transform, bulk insert), but a handful of our models with more complex entity relationships necessitated their own migration logic

backfill and read rollout

once dual writing has been enabled for a while and we’re confident there are no issues with the new data, run the backfill scripts. depending on the collection, this took anywhere from minutes to days
upon backfill completion, verify the successful migration using a custom built data verifier script that ensures that all the mongo documents were successfully transferred. this script knew how to compare both simple flat docs and ones with very nested relationships by using rails model level reflection API
finally, switch the reads from mongo to postgres. this was done through flipper so no additional deploys are necessary
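here’s a rough sketch of the core idea behind the verifier, assuming flat documents only. the real script used rails reflection to walk nested relationships, which this does not attempt. verify_migration and the hash shapes are hypothetical names for illustration.

```ruby
# hypothetical verifier sketch: both sides are plain arrays of hashes here.
# mongo docs are identified by :_id, postgres rows carry that id as :mongo_id
def verify_migration(mongo_docs, postgres_rows, fields)
  rows_by_mongo_id = postgres_rows.to_h { |row| [row[:mongo_id], row] }
  report = { missing: [], mismatched: [] }

  mongo_docs.each do |doc|
    row = rows_by_mongo_id[doc[:_id]]
    if row.nil?
      # the document never made it into postgres
      report[:missing] << doc[:_id]
    elsif fields.any? { |field| doc[field] != row[field] }
      # the document was transferred but at least one field differs
      report[:mismatched] << doc[:_id]
    end
  end

  report
end

docs = [{ _id: "a", name: "x" }, { _id: "b", name: "y" }, { _id: "c", name: "z" }]
rows = [{ mongo_id: "a", name: "x" }, { mongo_id: "b", name: "stale" }]
puts verify_migration(docs, rows, [:name]).inspect
```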

cleanup

once all dual writing is set up and all reads are done against postgres, remove the double writing and only keep our postgres active record reads and writes.
remove all traces of mongo
celebrate!

challenges

no project is without its challenges / setbacks and wow we had a number to deal with (and overcome!). we had issues at every stage of the sdlc

coordinating with other teams making changes to the service. we had to enact a code freeze since we were running into instances of people introducing new writes without the flags/dual writing stuff we required
wading through hard to understand business logic areas with low test coverage. we needed to create active record equivalents of a lot of writes, but some writes were fairly complex (very stateful, lots of conditions) and involved a coordination of multiple domain models
keeping the new active record models, tests, and scripts isolated. we can’t just delete the current application code so the new models needed to live alongside the old ones. we wanted to preserve the model names as much as possible, but you cannot have two models of the same name in models/, so we introduced a postgres namespace across the board to house the new code. this was a fantastic solution that made it both easy to add new models and delete the old ones later
database schema migration automation problems. we initially were running the new rails schema migrations by hand but when we switched over to automating the schema migration using k8s/helm, we accidentally made migrations run as one off jobs (instead of pre-release hooks). as a result, we had deploys that still succeeded despite failing migrations
some of our collections are large, so our backfill scripts need to run anywhere from several hours to several days. this increases the likelihood of running into issues mid data transfer, so it’s important for scripts to be idempotent and resumable. for the idempotent part, we added a mongo_id column (a reference to the mongo primary key) to all of our postgres tables to represent the identity of the mongo record migrated (in most backfilling instances, with only a couple of exceptions, we skip the insert if the mongo id is already migrated). for resumability, during migration we always read mongo documents ordered by their primary key (lucky for us the first four bytes of the 12 byte id are the creation timestamp) and we log out the last key in the current batch during migration processing as a checkpoint to use later as a cursor
setting off alerts when running backfills because of elevated reads / writes against postgres, which was in the call path of all existing requests. we ended up creating a read only mongo replica off of our primary in atlas to use for our backfilling. unfortunately, while this solved the contention issue, it introduced new problems around data consistency. for example there was an instance where i ran the backfill against an outdated replica and ended up inserting stale records into the new database. luckily the verifier detected missing records and i was able to drop the table and re-run the backfill with a fresh / up to date database instance
missing mongo key constraints and existence of duplicate records. we had a number of collections containing dupes due to missing uniqueness indices, so when we added the appropriate uniqueness constraints to the new tables in postgres, the backfilling process blew up because the mongo data was bad. this required some data cleanup and one of my teammates wrote a handy de-duping script using mongo’s aggregation API to identify and remove dupes by gathering the dupes for any given document key combination into lists, then keeping the latest and purging the rest.
- one minor snafu we ran into with this was that the aggregation code does a lot of the grouping of documents in memory on a node, and in one instance this caused a memory spike that impacted avg performance while the script was running
- based on the logs, we seem to get a good number of duplicate insert errors due to race conditions of requests attempting to modify the same resource at the same time, which probably explains why we had so many dupes in the old database to begin with. most of these cases can be ignored but it would be good to figure out why they’re happening so often
bad new data being inserted into our postgres database due to incorrect new code. for example, there was a situation where we were writing a UTC offset attribute into mongo through the ODM and when this got carried over to the active record class, it was only writing positive UTC offset values and excluding all negative offsets due to a bad guard clause i added. oopsie
- we also had minor and more subtle bugs like timestamps not being properly updated. for example, in active record we needed an explicit .touch to update the timestamp when no attributes changed but clients expected an updated timestamp. this was happening out of the box with mongoid
data divergence in the dual writing code during upserts, which was caught by the verifier. for example, some records had fields that accrued values over time, but once dual writing got introduced and it got executed by a new request, only the most recent data in the payload was inserted into the new database (the original values accrued on a field in the mongo database were not carried over). unfortunately, this data gap wasn’t addressed by our backfilling because our backfilling code skips dual written records, so the historical values were never carried over for that record during that process.
- to illustrate this with a scenario: let’s say a mongo record was created before dual writing and its values field gets value 1. time passes. we release the dual writing code. a new request wants to upsert the same record but this time with value 2. two writes happen: one to mongo, which ends up with [1,2] and one to postgres, which only has [2] (the most recent value).
- to fix these issues, we wrote one off data sync / repair tasks for the diverged records. pretty much any record that is written through upserts and whose backfilling strategy was an insert_all (skips on conflict) is a candidate for divergence.
contending with ongoing performance problems of the service and trying to work out whether degraded performance was caused by our new code or by what was already there (turns out a little bit of both!)
on rolling out a read for a single high traffic collection, the entire service went down for a solid 5-10 minutes, during which i couldn’t access the flipper UI because none of the pods were responsive. turns out this was caused by missing indexes, which pegged RDS CPU at 100% due to full table scans against the collection
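since idempotent and resumable backfills came up in the list above, here’s a minimal sketch of how those two properties can fit together. it assumes plain hashes instead of real mongo / postgres connections, and backfill and object_id_created_at are hypothetical names, not our actual scripts.

```ruby
# hypothetical stand-ins: mongo_docs is an array of hashes with a 24 char hex :_id,
# postgres_rows is a hash keyed by mongo_id
def backfill(mongo_docs, postgres_rows, batch_size: 2, after: nil)
  # always read in primary key order so a checkpoint is a usable cursor
  pending = mongo_docs.sort_by { |doc| doc[:_id] }
  pending = pending.select { |doc| doc[:_id] > after } if after
  checkpoint = after

  pending.each_slice(batch_size) do |batch|
    batch.each do |doc|
      # idempotent: skip anything already migrated (or already dual written)
      next if postgres_rows.key?(doc[:_id])

      postgres_rows[doc[:_id]] = { mongo_id: doc[:_id], name: doc[:name] }
    end
    # log the last key of the batch so a crashed run can resume from here
    checkpoint = batch.last[:_id]
    puts "checkpoint: #{checkpoint}"
  end

  checkpoint
end

# the first 4 bytes (8 hex chars) of a mongo id are seconds since the unix epoch
def object_id_created_at(hex_id)
  Time.at(hex_id[0, 8].to_i(16)).utc
end

docs = [
  { _id: "650000000000000000000003", name: "c" },
  { _id: "650000000000000000000001", name: "a" },
  { _id: "650000000000000000000002", name: "b" }
]
rows = {}
backfill(docs, rows)
# resuming from a logged checkpoint only touches later documents
backfill(docs, rows, after: "650000000000000000000002")
```

comparing the ids as strings works here because they are fixed length lowercase hex, so string order matches byte order. note that the skip-if-present check is also exactly what caused the upsert divergence described above, so it is not a free lunch.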

we did a pretty great job managing these issues as a team and right now we’re fully on postgres and it looks like it’s running smoothly so yay!

runners knee update

December 12, 2024 alan

good news! the runners knee pain that i was experiencing back in november is no longer an issue. i’ve been clocking in 13-14 miles and slowly building back up to 15/16 miles per week over the past two weeks and i haven’t been experiencing any pain around my patella. granted, i’ve mostly been using assault treadmills at the gym (i got a 1 month membership to avoid the ice and snow of december) so that’s lower impact but i’ve also been running harder than usual so maybe it cancels out. i did spend a couple of weeks before that running outside too so there’s good reason to think i’m pretty well recovered.

the funny thing is i think the thing that actually helped me was taking an entire week off running and ONLY doing strength training instead of doing both low intensity running AND strength training (specifically ones for quad strength building and my adductors). trying to do both was not actually working for me – i don’t think it was enough for the inflammation around my knee to actually subside. i live in a very hilly area so in reality even though i was doing low intensity, slower pace running i think i was still putting too much load on my knees.

so there you go, taking an entire week off running and focusing only on rehabilitation exercises was what finally helped. anyway here’s to another year of hopefully injury free running in 2025, peace.

my knee self diagnosis: patellofemoral syndrome (runners knee)

November 4, 2024 alan

since increasing my weekly mileage from 10 to 16 miles, i started noticing mild pain on the medial sides of both of my knee caps (my right more so than my left). i also added superfeet arch insoles into my shoes at around the same time, so that may have also affected my running mechanics.

from what i’ve been able to research, the most likely culprit is patellofemoral syndrome aka runners knee given the proximity of the pain to the knee cap. it’s on the medial side just underneath the knee cap. this hasn’t really seriously affected my daily mobility or even my running since it’s very mild, but it’s something i want to make sure i nip in the bud before it develops

here’s a table of common causes and which ones i believe apply to me

cause	applies?
kneecap misalignment	don’t know
overuse	most likely. 10 -> 16 is a 60% increase! recommended is closer to 10 – 15%
injury or trauma	no
weak thigh muscles	possible – i haven’t incorporated quad strengthening into my routine yet
tight hamstrings	unlikely, esp. because i stretch these during yoga often
tight achilles tendons	maybe, i don’t stretch my achilles
poor foot support	could be affected by my new arch “supports” that may be throwing off my normal gait
feet rolling in	maybe? most of the roads and sidewalks i run on have camber/slope. when i run on the road, i run on the left so there’s a leftward slope which i’m sure affects my foot roll motion
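the overuse row is just arithmetic, so here’s a quick sketch of it: the size of my 10 -> 16 jump, and how many weeks the same jump would take if each week only grew by the recommended 10 – 15%. percent_increase and weeks_to_reach are names i made up for this example.

```ruby
# percentage change going from one weekly mileage to another
def percent_increase(from, to)
  (to - from) * 100.0 / from
end

# number of weeks needed to reach the target if mileage grows
# by a fixed rate every week (0.10 means 10% per week)
def weeks_to_reach(start, target, weekly_rate)
  weeks = 0
  miles = start.to_f
  while miles < target
    miles *= 1 + weekly_rate
    weeks += 1
  end
  weeks
end

puts percent_increase(10, 16)      # size of the jump i actually made
puts weeks_to_reach(10, 16, 0.10)  # weeks needed at 10% per week
puts weeks_to_reach(10, 16, 0.15)  # weeks needed at 15% per week
```

at those rates the same increase is spread over roughly a month rather than done in one go.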

of this set of causes, the top ones are likely

overuse
weak thigh muscles. quadriceps in particular play a huge role in knee cap stabilization, and when the knee cap isn’t stable it doesn’t track smoothly, which is more likely to result in inflammation
tight achilles tendons. tight achilles leads to more of a forefoot strike during walking, and that in turn causes the quads to remain tense and pull on the knee cap
feet rolling in. so the knee tends to go the same direction as the foot, but the quads will kick in and try to balance by pulling the knee cap in the opposite direction which can also lead to tracking issues

my current recovery plan is

incorporate quad strengthening exercises with focus on compound movements
- sumo squats
- bulgarian split squats
- squat jumps
- lateral jumps
stretch calves and achilles post-run
reduce weekly miles from 16 to 14 or even back to 10-12 per week
- shift my current 8,2,3,3 pattern to 4,1,2,3 (halving my long run, then progressive increase throughout week)
remove the arch supports from my shoe (it’s an extra variable i don’t want to keep around…)
icing knees at end of day to reduce inflammation
knee cap mobilization exercises, also EOD

i’ll do another report in 3 weeks and let you know how it went!

xor and mod 2

November 4, 2024 alan

so there’s an interesting relationship between the XOR operation and mod 2

turns out, the xor (^) of any sequence of bits is equal to the sum of those bits modulo 2

for example

1 ^ 0 ^ 1 ^ 1 is the same as (1 + 0 + 1 + 1) % 2

if you take this step by step, the xor side:

1 ^ 0 = 1

1 ^ 1 = 0

0 ^ 1 = 1 (answer)

the modulo side:

1 + 0 + 1 + 1 = 3

3 % 2 = 1

why?

lets look at the truth table for XORs using two bits

left bit	right bit	xor result
0	0	0
0	1	1
1	0	1
1	1	0

xor table

XOR is an exclusive OR, so it will only be 1 if there’s ONLY ONE bit that’s on. if there’s two bits or no bits on, the result is 0. what other operation on two operands gives 0 for both (0, 0) and (1, 1)? addition modulo 2!

this equivalence exists because when both bits are 1, their sum is 2, and 2 mod 2 is 0. when both bits are 0, the sum is 0 and 0 mod 2 is 0. when only one of them (an odd number) is on, we always get a sum of 1 and 1 mod 2 is 1

even though we’re only looking at two bits, this actually generalizes to any sequence of bits: XORing any sequence of bits results in 0 when there is an even number of 1 bits (or none) and 1 when there is an odd number of 1 bits, which is exactly what the sum modulo 2 gives
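if you don’t want to take my word for it, here’s a small brute force check. it’s a sanity test rather than a proof, and xor_all and sum_mod_2 are just names i picked for this sketch.

```ruby
# fold xor across the whole sequence, starting from 0
def xor_all(bits)
  bits.reduce(0) { |acc, bit| acc ^ bit }
end

# add the bits up and take the remainder after dividing by 2
def sum_mod_2(bits)
  bits.sum % 2
end

# exhaustively compare both sides for every bit sequence of length 1 through 8
(1..8).each do |length|
  [0, 1].repeated_permutation(length) do |bits|
    raise "mismatch for #{bits.inspect}" unless xor_all(bits) == sum_mod_2(bits)
  end
end

# the example from above: both print 1
puts xor_all([1, 0, 1, 1])
puts sum_mod_2([1, 0, 1, 1])
```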

short vs middle vs long distance

November 1, 2024 (updated December 12, 2024) alan

ever wondered what it means for a runner to be a “middle distance” or “long distance” runner? in the running / racing world there are three main categories of events that differ by distance range

short or sprint distance

these are the traditional 100 meter (100m), 200m, and 400m, plus the 4x100m and 4x400m relays. these are pretty much purely anaerobic events. anything beyond 400m is in the middle distance category where the running starts to demand both high aerobic and anaerobic work

medium distance

common track distances are the 800m, 1500m, mile (1609m), 3000m, and the steeplechase variations involving obstacles and water jumps. anything beyond 3000m is going to be long distance

long distance

this is where my current comfort level is with running, although i do most of my higher intensity work in the short distances. common races in this range are the 5000m or 5k (though some people also consider the 5k a medium distance event), 10k, half marathon (21k), marathon (42k), and beyond (ultra marathons) like a 50k (31 miles). pretty much all road racing and cross country running falls into the long distance category.

the longest official race i’ve run so far is a super popular local 15k (https://www.boilermaker.com/). i’ve run this race each of the last 3 years. my impression is that the 15k is not a common race distance (compared to the 10k) because when i share this with people they always express surprise that such a distance is even a thing. my goal next year is to run a half marathon, so hopefully that will be my new long race record!

boilermaker fun fact: the boilermaker actually draws a good number of elite international runners – this past year the winner was john korir of kenya who’s one of the current top 10 marathon record holders!

boilermaker fun fact 2: not sure if this is verified, but i learned this through my wife. the event takes place in july, which seems odd because it’s a distance event that’s smack in the height of summer heat. but this is a couple of months before the fall marathon majors in the U.S. (chicago, nyc…) that run between september – november, so this off season schedule suits international runners that are training for the majors. i think this sort of makes sense because if they stuck the race in november, there’s probably going to be a non-existent elite pool…

anyway, here’s an easy / quick way to remember these ranges

short distance – up to a single lap on a standard outdoor track (400m)

medium distance – up to a 3k (7.5 laps on a standard outdoor track, just under two miles)

long distance – everything else
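the rule of thumb above fits in a few lines of code. this is just a sketch of my own cutoffs (400m and 3000m), not an official classification, and distance_category is a made up name.

```ruby
# classify a race by distance in meters using the cutoffs above:
# up to one lap (400m) is short, up to a 3k is medium, everything else is long
def distance_category(meters)
  if meters <= 400
    :short
  elsif meters <= 3000
    :medium
  else
    :long
  end
end

# 400m, the mile, a 5k, the boilermaker 15k, and a half marathon
[400, 1609, 5000, 15_000, 21_097].each do |meters|
  puts "#{meters}m is #{distance_category(meters)} distance"
end
```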

Week	Run 1	Run 2	Run 3	Long Run	Total Miles
1	3	3	3	6	15
2	3	3	3	7	16
3	3	4	3	7	17
4	3	4	3	5	15
5	3	4	3	9	19
6	3	5	3	10	21
7	3	5	3	7	18
8	3	6	3	12	24
9	3	6	3	10	22
10	3	6	3	8	20
