On syncing data

I’m working on an ETL pipeline, which pulls from a REST API, does some transformations, and upserts the results into a database. The REST API occasionally returns invalid data (e.g. alphabetic characters in the zip code, or misspelled US state names), and also occasionally throws HTTP 500 error pages.

I’m supposed to stall the entire pipeline whenever an error is thrown by the REST API, but the errors occur somewhat often (it’s relatively young software) and may take a few days to fix. Since the pipeline syncs fixed windows of data (one day at a time), there may be gaps if the pipeline is stalled for several days. Closing those gaps means either running a backfill task manually, or making the daily job smart enough to run from the start time of the last successful job, rather than from a fixed time span back.

We have a daily task that syncs data updated in the past day, and a separate backfill task that syncs data from an optional start date (the beginning of time by default) to a specified end date, but maybe we should ditch the daily tasks and modify the backfill task to track its latest sync date and run daily. In hindsight, I think we should’ve done that from the beginning.
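
A minimal sketch of the "run from the start time of the last successful job" idea, assuming the pipeline stores that start time somewhere durable. The names nextSyncWindow, runSync and lastSuccessfulStart are hypothetical and only here for illustration:

```javascript
// Hypothetical helper: compute the window the next sync run should cover.
// lastSuccessfulStart: Date when the last successful run started, or null.
// now: Date at which the current run starts.
// epoch: fallback start ("the beginning of time") when nothing has succeeded yet.
function nextSyncWindow(lastSuccessfulStart, now, epoch = new Date(0)) {
  const start = lastSuccessfulStart || epoch;
  if (start.getTime() > now.getTime()) {
    throw new Error('last successful start is in the future');
  }
  return { start, end: now };
}

// Only advance the stored marker after the whole run succeeds, so a stalled
// pipeline resumes where it left off instead of leaving a gap.
function runSync(state, now, syncFn) {
  const syncWindow = nextSyncWindow(state.lastSuccessfulStart, now);
  syncFn(syncWindow); // assumed to throw on API errors or invalid data
  return { lastSuccessfulStart: now };
}
```

If syncFn throws, runSync never returns a new state, so the next run covers the same start time again and the stalled days are picked up without a manual backfill.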

Testing data processing applications

We’re testing a data processing application. There are multiple pipeline stages, the boundaries (i.e. inputs and outputs) of which are well defined. The bulk of the code deals with reshaping data from one form to another; there is very little functional logic. Therefore, testing should be focused mainly on how the code reacts to:

  • Null data – when certain fields have null/nil/None value
  • “Empty” data – when certain fields are populated with the empty string “”, which is distinct from null/nil/None.
  • Missing fields – when not just the value is missing, but the field (or column) is missing entirely from the input
  • Improperly formatted data – from fields where the data is slightly off, like different date string formats, all the way to fuzz testing.
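
The cases above can be generated mechanically from one known-good record. This is only a sketch: fieldVariants and the sample record are hypothetical, and the malformed values would have to be chosen per field.

```javascript
// Hypothetical helper: derive edge-case variants of one field from a base record.
// The base record is never mutated; every variant is a shallow copy.
function fieldVariants(base, field, malformedValues = []) {
  const missingField = { ...base };
  delete missingField[field]; // field absent entirely, not just empty
  return {
    nullValue: { ...base, [field]: null },
    emptyString: { ...base, [field]: '' },
    missingField,
    malformed: malformedValues.map((value) => ({ ...base, [field]: value })),
  };
}

const base = { zip: '94103', state: 'California' };
const variants = fieldVariants(base, 'zip', ['9410A', '94103-']);
```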

Also to test:

  • Validate input schema – especially if you have no control over it
  • Validate output schema – This is basically a regression test
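
Schema validation does not need to be elaborate to be useful. Here is a rough sketch, assuming a schema is just a map from field name to the expected typeof result; schemaErrors and outputSchema are made-up names, and a real project might prefer a schema library instead.

```javascript
// Hypothetical schema format: field name -> expected typeof result.
const outputSchema = { id: 'number', zip: 'string', state: 'string' };

// Returns a list of problems; an empty list means the record matches the schema.
function schemaErrors(record, schema) {
  const errors = [];
  for (const [field, type] of Object.entries(schema)) {
    if (!(field in record)) {
      errors.push(`missing field: ${field}`);
    } else if (typeof record[field] !== type) {
      errors.push(`wrong type for ${field}: expected ${type}, got ${typeof record[field]}`);
    }
  }
  // Fields the schema does not know about usually mean the upstream format changed.
  for (const field of Object.keys(record)) {
    if (!(field in schema)) {
      errors.push(`unexpected field: ${field}`);
    }
  }
  return errors;
}
```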

I think a test harness for this should:

  • Make it easy to maintain/refresh test data. This may involve pulling inputs from your data sources, but testing shouldn’t be interrupted if the refresh fails.
  • Have designated “base” input objects
  • Have API calls for modifying input and then validating the output without having to manually reset the input object
  • Make sure the output schema is valid
  • Make sure the output values fall within a valid range
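
As a sketch of what the "base" input objects and the no-manual-reset API could look like: makeHarness, inRange and toOutput below are all hypothetical, with toOutput standing in for a real pipeline stage.

```javascript
// Hypothetical harness: holds a frozen base input and hands out modified copies.
function makeHarness(baseInput, transform) {
  const base = Object.freeze({ ...baseInput });
  return {
    // Run the stage under test on the base input with some fields overridden.
    // The base is copied on every call, so no manual reset is needed.
    run(overrides = {}) {
      return transform({ ...base, ...overrides });
    },
  };
}

// Hypothetical range check: true when value is a number within [min, max].
function inRange(value, min, max) {
  return typeof value === 'number' && value >= min && value <= max;
}

// Stand-in for a pipeline stage, purely for illustration.
function toOutput(input) {
  return { zip: input.zip === null ? null : String(input.zip), age: input.age };
}

const harness = makeHarness({ zip: '94103', age: 30 }, toOutput);
const normal = harness.run();
const nullZip = harness.run({ zip: null });
const afterNull = harness.run(); // the base is unchanged by the previous call
```

The copies are shallow, so a base input with nested objects would need a deep copy to stay isolated between calls.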

Apple push notifications on Amazon SNS from node.js

We’re going to hook up three things today: Apple Push Notification Service (APNS), Amazon Simple Notification Service (SNS), and node.js. This has the potential to be a mega-post, so instead, I’m going to write an annotated checklist. Maybe later I’ll turn it into a multi-part series.


Let’s talk about how all of these are hooked together. First, we want to send push notifications to iOS to alert users that something happened, and we must use APNS to push to iOS. Eventually, we may want to develop an Android version of our app, which would mean configuring a Google app as well (Google Firebase can send push notifications to both iOS and Android). Our focus here is on Amazon SNS, which also supports Baidu cloud messaging and Windows Phone. To interface with SNS, Amazon has a node.js library called aws-sdk.

Setting up Apple Push Notifications
  1. Create an App ID in Apple Dev Center with “Push Notifications” enabled. Create a certificate for development and production and download them.
  2. Add the App ID to a provisioning profile.
  3. In Xcode, go to the project settings. Update the provisioning profile. That should associate the push notification certificates with your app.
  4. In Xcode, go to the project targets. Go to the “Capabilities” heading and enable “Push Notifications”. Then scroll down to “Background Modes”, enable it and check the box for “Remote notifications”.
  5. In app code, get the device token. I’m not going into detail here, but if you’re writing natively, use didRegisterForRemoteNotificationsWithDeviceToken, or if you’re using the react-native-push-notification package, add a handler for onRegister: function(token) { … }

The iOS simulator cannot receive push notifications, so device token calls on the simulator will always fail.

Setting up Amazon SNS

First, a word about SNS concepts. There are two avenues for a user to be subscribed for notifications: topics and applications. Topics are for broadcasting to a group of users subscribed to the same topic. Applications are for sending notifications to specific endpoints. In this case, we’ll use applications.

  1. From the AWS console, go to the SNS dashboard. Go to “Applications”, then “Create Platform Application”. Enter an application name, then upload the APNS certificate that you downloaded from step 1 of setting up APNS, above. After choosing the .p12 file (you may have to export it from Keychain Access), click “Load credentials from file”, then finish up by clicking “Create platform application”. Take note of the ARN of the newly created platform application; you’ll need it later.
Getting aws-sdk on node.js
  1. In your node.js project, run npm install aws-sdk --save.
  2. Configure your AWS credentials: on Linux/macOS, create a file called ~/.aws/credentials. On Windows, create a file called C:\Users\USER_NAME\.aws\credentials. Hopefully you still have your AWS credentials from when you created your AWS account, because these should be the contents of the credentials file:
    [default]
    aws_access_key_id = YOUR_ACCESS_KEY_ID
    aws_secret_access_key = YOUR_SECRET_ACCESS_KEY
  3. At the top of your source files which will call SNS, add the following lines:
    var AWS = require('aws-sdk');
    AWS.config.update({region: 'us-west-2'});
    var sns = new AWS.SNS();

    Region should be filled in as appropriate. AWS calls will not work unless you configure the region.

Updating the device token from node.js

Before you chain all the SNS calls together, you might want to promisify them first. All the SNS calls are of the form sns.apiCall(params, callback), where params is an object containing the call parameters and callback is a function(err, data). Promises are another topic I’m not going to detail, but here’s an example of a manually promisified API call; it will save you from JavaScript callback hell:

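function getEndpointAttributes(endpointArn) {
  return new Promise(function(resolve, reject) {
    var params = {
      EndpointArn: endpointArn
    };
    sns.getEndpointAttributes(params, function(err, data) {
      if (err) { return reject(err); }  // reject the promise on failure
      resolve(data);                    // otherwise hand back the SNS response
    });
  });
}
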
  1. First, you should determine if you already have the user’s device token AND endpoint ARN stored someplace like your application database.
  2. If you don’t have the endpoint ARN, call sns.createPlatformEndpoint with the platform application ARN (step 1 from “Setting up Amazon SNS” above) and the user’s device token (you should code your mobile app to send this as a parameter), which will return the newly created endpoint ARN. The endpoint ARN is Amazon’s address for pushing notifications to your mobile app on a specific device. Save both the device token and endpoint ARN to some place like your application database.
  3. If you DO have the device token and endpoint ARN, compare the device tokens. Device tokens are mostly persistent, but are liable to change once in a while. If the device token you received and the device token you stored are the same, there’s nothing else to do.
  4. If the device token you received and the device token you stored are different, then you need to delete the old endpoint by calling sns.deleteEndpoint with the old endpoint ARN, and then call sns.createPlatformEndpoint as detailed in step 2.
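
The four steps above can be sketched roughly as follows. This assumes snsClient exposes promisified versions of the SNS calls (wrappers in the style of getEndpointAttributes earlier), and that saveEndpoint is your own function for writing to the application database; decideEndpointAction, updateDeviceToken, snsClient and saveEndpoint are all hypothetical names.

```javascript
// Decide what to do with a device token the mobile app just sent.
// stored: { deviceToken, endpointArn } from the application database, or null.
function decideEndpointAction(stored, receivedToken) {
  if (!stored || !stored.endpointArn) return 'create';     // step 2
  if (stored.deviceToken === receivedToken) return 'none'; // step 3
  return 'replace';                                        // step 4
}

// snsClient: assumed object whose methods take params and return promises.
// saveEndpoint: assumed async function persisting the token and endpoint ARN.
async function updateDeviceToken(snsClient, platformApplicationArn, stored, receivedToken, saveEndpoint) {
  const action = decideEndpointAction(stored, receivedToken);
  if (action === 'none') return stored.endpointArn;
  if (action === 'replace') {
    await snsClient.deleteEndpoint({ EndpointArn: stored.endpointArn });
  }
  const data = await snsClient.createPlatformEndpoint({
    PlatformApplicationArn: platformApplicationArn,
    Token: receivedToken,
  });
  await saveEndpoint(receivedToken, data.EndpointArn);
  return data.EndpointArn;
}
```

Keeping the decision in its own function makes the branching easy to unit test without touching AWS.
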
Sending a push notification
  1. Call sns.getEndpointAttributes with the user’s endpoint ARN. Make sure that the attributes have the “Enabled” property set to true, otherwise don’t send the notification.
  2. Call sns.publish with a params object containing Message, Subject and TargetArn. TargetArn should be the user’s endpoint ARN.
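
A rough sketch of those two steps, again assuming a hypothetical snsClient whose methods return promises rather than taking callbacks. One detail worth knowing: SNS returns endpoint attributes as strings, so Enabled comes back as 'true' or 'false' rather than a boolean. The helper names isEndpointEnabled, buildPublishParams and sendPush are made up.

```javascript
// SNS returns endpoint attributes as strings, so compare against 'true'.
function isEndpointEnabled(data) {
  return Boolean(data && data.Attributes && data.Attributes.Enabled === 'true');
}

// Build the params object for sns.publish.
function buildPublishParams(endpointArn, subject, message) {
  return { Message: message, Subject: subject, TargetArn: endpointArn };
}

// snsClient: assumed object with promisified getEndpointAttributes and publish.
// Resolves to true if the notification was published, false if it was skipped.
async function sendPush(snsClient, endpointArn, subject, message) {
  const data = await snsClient.getEndpointAttributes({ EndpointArn: endpointArn });
  if (!isEndpointEnabled(data)) return false; // don't send to disabled endpoints
  await snsClient.publish(buildPublishParams(endpointArn, subject, message));
  return true;
}
```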

If the notification fails to send, the endpoint may have been silently disabled. SNS will disable an endpoint if the notification service (APNS or GCM) tells it that the device token is invalid. That may be caused by a bad certificate, or the device token expiring without a new one being uploaded for that user. To find out why, you can go to the SNS console, select your platform application, go to the “Actions” menu and select “Delivery status”. From that dialog, you can create an IAM role to log delivery failures to CloudWatch.

Simple UI Animations with react-native

Facebook’s react-native platform has an animation API that lets you animate Text, Image or View components. You can animate properties of the components, then make them play on an event, chain them together sequentially, or play them all together.

Styles you can animate:

Here are some stylesheet properties of components that you can animate:

  • Opacity
  • Width
  • Height
  • Translation offsets
  • Rotation
Animation variables:

One-dimensional values to be animated (such as opacity or height) are stored in an Animated.Value object, while two-dimensional values (such as XY translation) are stored in an Animated.ValueXY object. These are initialized like this:

import { Animated } from 'react-native';
this.state = {
  opacityValue: new Animated.Value(1),  // opacity ranges from 0 to 1
  translation: new Animated.ValueXY({x: 0, y: 0})
};

Specifying animation types

There are three ways to procedurally change the value of an animation variable: spring, decay, and timing. To call these functions, you must pass the variable being animated and a configuration object, with parameters described below. For spring and timing, the target value at the end of the animation is set with toValue.

  • Spring is straightforward to use; it causes the value to bounce, and you can optionally set the friction (“bounciness”) and tension (“speed”).
  • Decay is also straightforward to use; it’s simply an exponential decay function where you set the rate of decay (deceleration, e.g., 0.97) and the initial velocity, which is required.
  • Timing has 3 parameters that you can set: delay and duration are obvious, but the third one, easing, requires a deeper understanding to use. The default easing function is Easing.inOut(Easing.ease), but if you want to override it with something like Easing.sin or Easing.bezier(x1, y1, x2, y2), you must add the line “import { Easing } from 'react-native';” to the top of your source code. The available easing functions are listed in the react-native Easing API reference.
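
To make the role of easing concrete, here is a sketch in plain JavaScript. It is not react-native's implementation, only an illustration of the idea: an easing function maps normalized time t in [0, 1] to progress in [0, 1], and the value is interpolated between its starting value and toValue. The names linear, easeInSine and valueAt are hypothetical.

```javascript
// Two example easing curves: constant speed, and a slow start.
const linear = (t) => t;
const easeInSine = (t) => 1 - Math.cos((t * Math.PI) / 2);

// Value of a timing-style animation after `elapsed` milliseconds.
// linear is only this sketch's default easing, chosen for simplicity.
function valueAt(elapsed, { from, toValue, duration, delay = 0, easing = linear }) {
  // Clamp normalized time so the value holds before the delay and after the end.
  const t = Math.min(Math.max((elapsed - delay) / duration, 0), 1);
  return from + (toValue - from) * easing(t);
}

// Halfway through a 200ms fade from 1 to 0 with linear easing.
const halfway = valueAt(100, { from: 1, toValue: 0, duration: 200 });
```

With easeInSine the same fade would still be close to 1 at the halfway point, because the curve starts slowly and speeds up toward the end.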

Example to get something to fade out over 200ms:

Animated.timing(this.state.opacityValue, {toValue: 0, duration: 200})

You can also set the value directly by calling setValue() on the animation variable, e.g. this.state.opacityValue.setValue(1).

Assigning animation variables to a component

Using the above example of opacity, just set a stylesheet property:

render() {
  let myStyle = {
    opacity: this.state.opacityValue,
  };
  return (
    <Animated.View style={myStyle}>
      {/* child components go here */}
    </Animated.View>
  );
}

Note how <Animated.View> is used instead of <View>. For Animated.ValueXY variables, you may have to get the X and Y values directly by accessing myXYValue.x._value and myXYValue.y._value.

Calling the animation

To start an animation, you simply call .start() on the configured animation, optionally passing a callback function.

Animated.timing(this.state.opacityValue,
  {toValue: 0, duration: 200}).start(() =>
    Alert.alert('Animation done!'))

The callback function is how you get animations to loop. At the time of writing, react-native does not have built-in parameters for looping an animation.
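
A sketch of looping through the callback, written against anything that has a start(callback) method so it is not tied to one animation type. loopAnimation, createAnimation and shouldContinue are hypothetical names; the { finished } argument is what react-native passes to the start callback.

```javascript
// createAnimation: function returning a freshly configured animation, i.e. any
//   object with start(callback), such as the result of Animated.timing(...).
// shouldContinue: hypothetical predicate that lets the caller stop the loop.
function loopAnimation(createAnimation, shouldContinue) {
  createAnimation().start((result) => {
    // finished is false when the animation was interrupted, so stop looping then.
    if (result && result.finished && shouldContinue()) {
      loopAnimation(createAnimation, shouldContinue);
    }
  });
}
```

In practice createAnimation would also reset the variable with setValue() before returning the animation; otherwise the second pass starts at the target value and nothing visibly moves.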

Sequential and parallel animations

You can start animations in sequence or in parallel by calling Animated.sequence() or Animated.parallel() with an array of configured animations. It’s better explained by an example:

Animated.parallel([
  Animated.spring(this.state.heartSize, { toValue: 1, friction: 0.7, tension: 0.4 }),
  Animated.spring(this.state.circleSize, { toValue: 0.5, friction: 1.0, tension: 0.3 }),
]).start();

  • Create animation variables by using new Animated.Value()
  • Assign Animation variables to stylesheet properties attached to <Animated.View>, <Animated.Text> or <Animated.Image> tags
  • Configure an animation by calling Animated.timing(), Animated.spring(), or Animated.decay()
  • Start an animation by calling .start() on the configured animation
  • Start animations in sequence or parallel by wrapping multiple animations in an array and passing the array as a parameter to Animated.sequence() or Animated.parallel()
  • For more details, see the Animated API reference or the react-native Animations guide
Full code example
import React, { Component } from 'react';
import { Alert, Animated, Button, Text, View } from 'react-native';

export default class AnimationView extends Component {

  constructor(props) {
    super(props);
    this.state = {
      opacityValue: new Animated.Value(1),
      heartSize: new Animated.Value(0.5),
      circleSize: new Animated.Value(1.0),
    };
  }

  render() {
    let textStyle = {
      opacity: this.state.opacityValue
    };
    let heartStyle = {
      transform: [{scale: this.state.heartSize}]
    };
    let circleStyle = {
      transform: [{scale: this.state.circleSize}]
    };
    return (
      <View>
          {/* Fade the text out over one second, then show an alert */}
          <Button onPress={() => {
              Animated.timing(this.state.opacityValue, { toValue: 0, duration: 1000 }).start(() => Alert.alert('Animation done!'));
            }}
            title="Fade text away"
            color="powderblue" />
          <View style={{backgroundColor: "powderblue"}}>
              <Animated.Text style={textStyle}>Fade away</Animated.Text>
          </View>

          {/* Grow the heart and shrink the circle at the same time */}
          <Button onPress={() => {
              Animated.parallel([
                Animated.spring(this.state.heartSize, { toValue: 1, friction: 0.7, tension: 0.4 }),
                Animated.spring(this.state.circleSize, { toValue: 0.5, friction: 1.0, tension: 0.3 }),
              ]).start();
            }}
            title="Initiate love"
            color="pink" />
          <View style={{flexDirection: "row", backgroundColor: "white"}}>
              <Animated.Image style={heartStyle} source={require('./heart.png')}/>
              <Animated.Image style={circleStyle} source={require('./circle.png')}/>
          </View>
      </View>
    );
  }
}