Lightcast offers the world’s most comprehensive data and insights on labor market trends, job postings, workforce profiles, compensation, career pathways, skills projections, and demographics. Following the launch of Lightcast Data Shares, slicing, dicing, and analyzing large datasets is more seamless than ever before.
What are Lightcast Data Shares?
Through Lightcast Data Shares, Lightcast data is now available directly on Google Cloud Storage (GCS) and other top file storage platforms, data warehouses, and marketplaces.
Data sharing through Lightcast is flexible and secure. Lightcast keeps the sharing process simple, and clients can slice, edit, or prepare data before integrating it into internal or external solutions. Because data is delivered directly to each client’s existing warehouse or cloud storage through platform-native sharing, a data feed, or both, information shared through Lightcast Data Shares is always up to date.
Accepting data shares from Lightcast requires almost no internal technical resources from customers, resulting in lower costs and increased efficiencies. Data shares streamline the process of using Lightcast data for ingestion into large language models (LLMs), AI solutions, and predictive analytics models.
Depending on each client’s specific business needs, Lightcast Data Shares can also be customized. Each data share integration can be fully operational within two business days.
How Data Shares and Google Cloud Storage Work Together
Google Cloud Storage (GCS), an object storage service offered by Google Cloud Platform, is one of several delivery destinations supported by Lightcast Data Shares. Lightcast has very flexible support for Google Cloud Storage Buckets, including control over:
Bucket region
Whether the bucket is created and managed by Bobsled, or is an external bucket managed by you or a partner
Access control: the permissions granted allow each Google principal to perform read and copy operations on the Bobsled-managed destination bucket
What if I Don’t Use Google Cloud Storage?
Lightcast supports several data share destinations beyond Google Cloud Storage, including Snowflake, Databricks, Google BigQuery, Microsoft Azure Blob Storage, SFTP, and Amazon S3.
How to Connect Google Cloud Storage with Lightcast Data Shares
Before you configure a destination, a share must be created. Lightcast requires you to provide one or more Google principals, which are then granted access to the data in the Lightcast-managed destination.
Before you can consume a data transfer, the transfer must have been sent to the destination, and access must be configured in Lightcast for the identity that will consume the data.
Consuming a data transfer
From the Shares list page, click on the share that you would like to access.
Once a data transfer in the share has been completed, select the button Access Data.
Option 1: Accessing Data via Web Console
You can view and download the data through the web console link. Be sure to log in to the Google Cloud console with the account that has been configured to access the Bobsled share.
Select the Web console tab in the access dialog
Click the link icon to view the data in the GCP Web Console.
Option 2: Accessing Data via Command line
Using the Google Cloud Storage command-line tools (gsutil or gcloud), you can list, copy, and sync the contents of the data transfer in Google Cloud Storage. To use the following commands, copy the Cloud Storage URI shown in the Access Data dialog.
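Before pasting the copied URI into the commands that follow, it can help to confirm that it looks like a Cloud Storage URI, which begins with gs://. The sketch below is only an illustration: the helper name is_gcs_uri and the example bucket are hypothetical and do not come from Lightcast.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the value starts with gs://
# and has at least one character after the scheme.
is_gcs_uri() {
  case "$1" in
    gs://?*) return 0 ;;
    *) return 1 ;;
  esac
}

# Placeholder value for illustration; replace it with the URI copied
# from the Access Data dialog.
SHARE_URI="${SHARE_URI:-gs://example-share-bucket/transfer}"

if is_gcs_uri "$SHARE_URI"; then
  echo "Looks like a Cloud Storage URI: $SHARE_URI"
else
  echo "Not a Cloud Storage URI: $SHARE_URI" >&2
fi
```

The check is purely syntactic. It does not confirm that the bucket exists or that your account can read it.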
Log in to the GCP command line
1. Run the login command
The login command for the Google Cloud CLI is gcloud auth login
If you would like to access the data as a service account, run the command:
gcloud auth activate-service-account [service-account-email] --key-file=[path-to-private-key-file]
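The two login paths above can be wrapped in one small script. This is a minimal sketch rather than an official Lightcast procedure: the variable names SA_EMAIL, SA_KEY_FILE, and PRINT_ONLY and the function names are assumptions made for illustration. With PRINT_ONLY left at its default of 1, the script only prints the command it would run.

```shell
#!/bin/sh
# PRINT_ONLY=1 (default) prints commands instead of running them.
PRINT_ONLY="${PRINT_ONLY:-1}"

# Run a command, or just print it when PRINT_ONLY is 1.
run() {
  if [ "$PRINT_ONLY" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Use the service account when both SA_EMAIL and SA_KEY_FILE are set
# (hypothetical variable names); otherwise fall back to user login.
gcloud_login() {
  if [ -n "${SA_EMAIL:-}" ] && [ -n "${SA_KEY_FILE:-}" ]; then
    run gcloud auth activate-service-account "$SA_EMAIL" --key-file="$SA_KEY_FILE"
  else
    run gcloud auth login
  fi
}

gcloud_login
```

To authenticate for real, set PRINT_ONLY=0 and, for a service account, export SA_EMAIL and SA_KEY_FILE before running the script. The identity you log in with must be one of the Google principals that was granted access to the share.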
Access via the gcloud command-line tool
1. List the contents
To list the data in the bucket, use the command gcloud storage ls
gcloud storage ls -r <storage-bucket-URI>
Parameters used:
-r (recursive): lists all objects in the bucket
Optional parameters to use with the list command:
-l (long listing): shows additional information about each object (object size, creation time, etc.)
2. Copy the contents to your bucket
To copy the data in the bucket, use the command gcloud storage cp
gcloud storage cp -r <storage-bucket-URI> <your bucket/path>
Parameters used:
-r (recursive): copies the entire directory tree
Optional parameters to use with the copy command:
-n (no-clobber): prevents overwriting existing files at the destination
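The list and copy steps above can be chained so that the copy runs only if the listing succeeds. The following is a sketch under stated assumptions: the bucket URIs are placeholders, and PRINT_ONLY, run, list_share, and copy_share are hypothetical names introduced for this example. By default the script prints the gcloud commands instead of running them.

```shell
#!/bin/sh
# PRINT_ONLY=1 (default) prints commands instead of running them.
PRINT_ONLY="${PRINT_ONLY:-1}"

# Placeholder URIs; replace with the URI from the Access Data dialog
# and with your own destination bucket and path.
SHARE_URI="${SHARE_URI:-gs://example-share-bucket/transfer}"
DEST_URI="${DEST_URI:-gs://example-own-bucket/lightcast}"

run() {
  if [ "$PRINT_ONLY" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Recursive long listing of the shared transfer.
list_share() {
  run gcloud storage ls -r -l "$1"
}

# Recursive copy that skips objects already present at the destination.
copy_share() {
  run gcloud storage cp -r -n "$1" "$2"
}

# Copy only if the listing succeeded (for example, access is configured).
list_share "$SHARE_URI" && copy_share "$SHARE_URI" "$DEST_URI"
```

Listing first is a cheap way to confirm that the logged-in identity can read the share before a large copy starts. Because -n skips objects that already exist at the destination, rerunning the copy does not overwrite earlier files.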
3. Sync the contents
Please use gsutil to sync the contents of the data transfer to your bucket.
Access via the gsutil command-line tool
1. List the contents
To list the data in the bucket, use the command gsutil ls
gsutil ls -r <storage-bucket-URI>
Parameters used:
-r (recursive): lists all objects in the bucket
Optional parameters to use with the list command:
-l (long listing): shows additional information about each object (object size, creation time, etc.)
2. Copy the contents to your own bucket
To copy the data in the bucket, use the command gsutil cp
gsutil cp -r <storage-bucket-URI> <your bucket/path>
Parameters used:
-r (recursive): copies the entire directory tree
Optional parameters to use with the copy command:
-n (no-clobber): prevents overwriting existing files at the destination
3. Sync the contents
Use rsync if you would like to copy only files that are new or updated. Learn more about the rsync command.
gsutil rsync -r <storage-bucket-URI> <your bucket/path>
Parameters used:
-r (recursive): synchronizes the entire directory tree
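Before a first sync, it can be useful to preview what rsync would transfer. gsutil rsync accepts an -n flag that performs a dry run and only reports the planned changes. The sketch below previews and then syncs. The bucket URIs are placeholders, and PRINT_ONLY, run, preview_sync, and sync_share are hypothetical names used for illustration only. By default the script prints the gsutil commands rather than running them.

```shell
#!/bin/sh
# PRINT_ONLY=1 (default) prints commands instead of running them.
PRINT_ONLY="${PRINT_ONLY:-1}"

# Placeholder URIs; replace with the URI from the Access Data dialog
# and with your own destination bucket and path.
SHARE_URI="${SHARE_URI:-gs://example-share-bucket/transfer}"
DEST_URI="${DEST_URI:-gs://example-own-bucket/lightcast}"

run() {
  if [ "$PRINT_ONLY" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# gsutil's own dry run: reports what would be copied, copies nothing.
preview_sync() {
  run gsutil rsync -r -n "$1" "$2"
}

# Actual sync: copies only new or updated objects.
sync_share() {
  run gsutil rsync -r "$1" "$2"
}

preview_sync "$SHARE_URI" "$DEST_URI" && sync_share "$SHARE_URI" "$DEST_URI"
```

This sketch deliberately leaves out rsync's delete option, so objects that exist only in your destination bucket are left untouched.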
Learn more about how Lightcast Data Shares work or get in touch with our team to discuss your specific business requirements and connect Google Cloud Storage with Lightcast Data Shares.