# s3-lambda

**s3-lambda** enables you to run lambda functions over a context of S3 objects. It has a stateless architecture with concurrency control, allowing you to process a large number of files very quickly. This is useful for prototyping complex data jobs without an infrastructure like Hadoop or Spark.
At Littlstar, we use **s3-lambda** for all sorts of data pipelining and analytics.
## Install

```bash
npm install s3-lambda --save
```
## Quick Example
```js
const S3Lambda = require('s3-lambda');

// example options
const lambda = new S3Lambda({
  access_key_id: 'aws-access-key',
  secret_access_key: 'aws-secret-key',
  show_progress: true,
  verbose: true,
  max_retries: 10,
  timeout: 1000
});

const bucket = 'my-bucket';
const prefix = 'path/to/files/';

lambda
  .context(bucket, prefix)
  .forEach(object => {
    // do something with object
  })
  .then(_ => console.log('done!'))
  .catch(console.error);
```
## Setting Context
Before initiating a lambda expression, you must tell **s3-lambda** what files to operate over. You do this by calling `context`, which returns a promise, so you can chain it with the request. The context function takes five arguments: `bucket`, `prefix`, `marker`, `limit`, and `reverse`.
```js
lambda.context(
  bucket,  // the S3 bucket to use
  prefix,  // the prefix of the files to use - s3-lambda will operate over every file with this prefix
  marker,  // (optional, default null) start at this file/prefix
  limit,   // (optional, default Infinity) limit the # of files operated over
  reverse  // (optional, default false) if true, operate over all files in reverse
) // .forEach() ... you can chain functions here
```
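For example, a sketch using the optional arguments (the marker value and limit here are made up for illustration):

```js
// start after 'path/to/files/0100', operate over at most 500 files, in reverse
lambda
  .context('my-bucket', 'path/to/files/', 'path/to/files/0100', 500, true)
  .forEach(object => { /* do something with object */ })
  .then(_ => console.log('done!'))
  .catch(console.error);
```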
You can also provide an array of contexts like this:
```js
const ctx1 = {
  bucket: 'my-bucket',
  prefix: 'path/to/files/'
  // marker: 'path/to/files/somefile'
};

const ctx2 = {
  bucket: 'my-other-bucket',
  prefix: 'path/to/files/'
  // marker: 'path/to/files/somefile'
};

lambda.context([ctx1, ctx2]) // .map() ...
```
## Lambda Functions

Perform synchronous or asynchronous functions over each file in a directory.

- `each`
- `forEach`
- `map`
- `reduce`
- `filter`
### each

`each(fn[, isasync])`

Performs `fn` on each S3 object in parallel. You can set the concurrency level (defaults to `Infinity`). If `isasync` is true, `fn` should return a Promise.
```js
lambda
  .context(bucket, prefix)
  .concurrency(5) // operates on 5 objects at a time
  .each(object => console.log(object))
  .then(_ => console.log('done!'))
  .catch(console.error);
```
### forEach

`forEach(fn[, isasync])`

Iterates over each file in an S3 directory and performs `fn`. If `isasync` is true, `fn` should return a Promise.
```js
lambda
  .context(bucket, prefix)
  .forEach(object => { /* do something with object */ })
  .then(_ => console.log('done!'))
  .catch(console.error);
```
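If the work is asynchronous, return a Promise and pass `true` for `isasync`; a minimal sketch with a simulated delay:

```js
// a sketch: an async iterator that resolves after a simulated delay
const fn = object => new Promise(resolve => {
  setTimeout(() => {
    console.log(object.length);
    resolve();
  }, 100);
});

lambda
  .context(bucket, prefix)
  .forEach(fn, true) // isasync = true
  .then(_ => console.log('done!'))
  .catch(console.error);
```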
### map

`map(fn[, isasync])`

**Destructive.** Maps `fn` over each file in an S3 directory, replacing each file with what is returned from the mapper function. If `isasync` is true, `fn` should return a Promise.
```js
const addSmiley = object => object + ':)';

lambda
  .context(bucket, prefix)
  .map(addSmiley)
  .then(_ => console.log('done!'))
  .catch(console.error);
```
You can make this non-destructive by specifying an `output` directory.
```js
const outputBucket = 'my-bucket';
const outputPrefix = 'path/to/output/';

lambda
  .context(bucket, prefix)
  .output(outputBucket, outputPrefix)
  .map(addSmiley)
  .then(_ => console.log('done!'))
  .catch(console.error);
```
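The mapper can also be asynchronous; a sketch that upper-cases each file into the output directory (`isasync` = true, so the mapper returns a Promise):

```js
// a sketch: an async mapper that upper-cases each file's contents
const toUpper = object => Promise.resolve(object.toString().toUpperCase());

lambda
  .context(bucket, prefix)
  .output(outputBucket, outputPrefix)
  .map(toUpper, true)
  .then(_ => console.log('done!'))
  .catch(console.error);
```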
### reduce

`reduce(fn[, isasync])`

Reduces the objects in the working context to a single value.
```js
// concatenates all the files
const reducer = (previousValue, currentValue, key) => {
  return previousValue + currentValue;
};

lambda
  .context(bucket, prefix)
  .reduce(reducer)
  .then(result => { /* do something with result */ })
  .catch(console.error);
```
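For instance, a sketch that sums the size of every file in the context (it assumes the accumulator starts out undefined on the first call, hence the `|| 0` guard):

```js
// a sketch: compute the total byte length of all files under the prefix
const totalSize = (previousValue, currentValue, key) => {
  return (previousValue || 0) + currentValue.length;
};

lambda
  .context(bucket, prefix)
  .reduce(totalSize)
  .then(total => console.log('total bytes:', total))
  .catch(console.error);
```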
### filter

`filter(fn[, isasync])`

**Destructive.** Filters (deletes) files in S3. `fn` should return `true` to keep the object, and `false` to delete it. If `isasync` is true, `fn` should return a Promise.
```js
// filters out empty files
const fn = object => object.length > 0;

lambda
  .context(bucket, prefix)
  .filter(fn)
  .then(_ => console.log('done!'))
  .catch(console.error);
```
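An asynchronous predicate works the same way; a sketch that keeps only files containing the string 'ERROR':

```js
// a sketch: an async predicate (isasync = true) resolving to true or false
const keepErrors = object => Promise.resolve(object.indexOf('ERROR') !== -1);

lambda
  .context(bucket, prefix)
  .filter(keepErrors, true)
  .then(_ => console.log('done!'))
  .catch(console.error);
```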
Just like in `map`, you can make this non-destructive by specifying an `output` directory.
```js
lambda
  .context(bucket, prefix)
  .output(outputBucket, outputPrefix)
  .filter(fn)
  .then(_ => console.log('done!'))
  .catch(console.error);
```
## S3 Functions

Promise-based wrappers around common S3 methods.

- `list`
- `keys`
- `get`
- `put`
- `copy`
- `delete`
### list

`list(bucket, prefix[, marker])`

Lists all keys in `s3://bucket/prefix`. If you supply a marker, **s3-lambda** will start listing alphabetically from there.
```js
lambda
  .list(bucket, prefix)
  .then(list => console.log(list))
  .catch(console.error);
```
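A sketch using the optional marker (the key is made up for illustration):

```js
// start listing alphabetically after this key
lambda
  .list(bucket, prefix, 'path/to/files/0100')
  .then(list => console.log(list))
  .catch(console.error);
```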
### keys

`keys(bucket, prefix[, marker])`

Returns an array of keys for the given `bucket` and `prefix`.
```js
lambda
  .keys(bucket, prefix)
  .then(keys => console.log(keys))
  .catch(console.error);
```
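A sketch that chains `keys` with `get` to fetch the first object under the prefix:

```js
lambda
  .keys(bucket, prefix)
  .then(keys => lambda.get(bucket, keys[0]))
  .then(object => { /* do something with object */ })
  .catch(console.error);
```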
### get

`get(bucket, key[, encoding[, transformer]])`

Gets an object in S3, calling `toString(encoding)` on objects.
```js
lambda
  .get(bucket, key)
  .then(object => { /* do something with object */ })
  .catch(console.error);
```
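The optional `encoding` is passed through to `toString`; a sketch that reads an object as a base64 string:

```js
lambda
  .get(bucket, key, 'base64')
  .then(object => { /* object is a base64-encoded string */ })
  .catch(console.error);
```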
Optionally, you can supply your own transformer function to use when retrieving objects.
```js
const zlib = require('zlib');

const transformer = object => {
  return zlib.gunzipSync(object).toString('utf8');
};

lambda
  .get(bucket, key, null, transformer)
  .then(object => { /* do something with object */ })
  .catch(console.error);
```
### put

`put(bucket, key, object[, encoding])`

Puts an object in S3. Default encoding is `utf8`.
```js
lambda
  .put(bucket, key, 'hello world!')
  .then(_ => console.log('done!'))
  .catch(console.error);
```
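Anything you can serialize to a string works; a sketch that stores a JSON payload with the default `utf8` encoding (the key is made up for illustration):

```js
// a sketch: store a JSON payload as utf8 text
const payload = JSON.stringify({ hello: 'world' });

lambda
  .put(bucket, 'path/to/output/payload.json', payload)
  .then(_ => console.log('done!'))
  .catch(console.error);
```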
### copy

`copy(sourceBucket, sourceKey, targetBucket, targetKey)`

Copies an object in S3 from `s3://sourceBucket/sourceKey` to `s3://targetBucket/targetKey`.
```js
lambda
  .copy(sourceBucket, sourceKey, targetBucket, targetKey)
  .then(_ => console.log('done!'))
  .catch(console.error);
```
### delete

`delete(bucket, key)`

Deletes an object in S3 (`s3://bucket/key`).
```js
lambda
  .delete(bucket, key)
  .then(_ => console.log('done!'))
  .catch(console.error);
```
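Since these methods all return promises, they compose; for example, a sketch of a 'move' built from `copy` and `delete`:

```js
// a sketch: move an object by copying it, then deleting the original
lambda
  .copy(sourceBucket, sourceKey, targetBucket, targetKey)
  .then(_ => lambda.delete(sourceBucket, sourceKey))
  .then(_ => console.log('moved!'))
  .catch(console.error);
```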