Upgrade, redeploy, or uninstall Soda
Last modified on 26-Sep-24
Migrate from self-hosted to Soda-hosted agent
Redeploy a self-hosted Soda Agent
Upgrade a self-hosted Soda Agent
Upgrade to Soda Agent 1.0.0 or greater
Upgrade Soda Library
Uninstall Soda Library
Migrate from Soda Core
Migrate a data source from a self-hosted to a Soda-hosted agent
If you already use a self-hosted Soda Agent deployed in a Kubernetes cluster to connect to your data source(s), you have the option of migrating a connected data source to a Soda-hosted agent. Though you must reconfigure your data source connection to the new Soda agent, your checks, check history, and scan definition remain intact.
- Be aware that Soda-hosted agents are only compatible with the following data sources: BigQuery, Databricks SQL, MS SQL Server, MySQL, PostgreSQL, Redshift, Snowflake.
- 🔴 When you migrate to a Soda-hosted agent, Soda Cloud resets all the connection configuration details for your data source. Be sure to capture all existing data source connection details before migrating so you can re-enter the details for the data source connection.
- As a user with permission to do so in Soda Cloud, navigate to your avatar > Organization Settings. In the Organization tab, click the checkbox to Enable Soda-hosted Agent.
- Navigate to your avatar > Data Sources, then access the Agents tab. Notice your out-of-the-box Soda-hosted agent that is up and running.
- Navigate to the Data Sources tab, then click to select the data source you wish to migrate to the Soda-hosted agent.
- In the 2. Connect the Data Source tab, copy the contents of the editing panel and paste them into a temporary, secure, local place in your system. Switching agents resets all connection configuration parameters, so be sure to record existing parameter settings before proceeding.
- In the 1. Attributes tab, use the dropdown for Default Scan Agent to select soda-hosted-agent.
- Return to the 2. Connect the Data Source tab, then, using the configuration values you recorded earlier, use the dropdowns to re-enter the values, then click Test Data Source.
- When the test completes successfully, Save your changes to the data source.
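Before starting the migration, it can help to confirm that the data source type is on the compatibility list above. The following is a minimal sketch, not part of Soda's tooling: the function name `is_soda_hosted_compatible` is hypothetical, and the list it checks is copied from this page, so it needs updating if Soda changes the supported data sources.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given data source type appears in the
# list of data sources that this page says Soda-hosted agents support.
is_soda_hosted_compatible() {
  # Lowercase the input so that "PostgreSQL" and "postgresql" both match.
  source_type=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$source_type" in
    bigquery|"databricks sql"|"ms sql server"|mysql|postgresql|redshift|snowflake)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

# Usage example
if is_soda_hosted_compatible "PostgreSQL"; then
  echo "PostgreSQL can be migrated to a Soda-hosted agent"
fi
```

The check only compares names; it does not contact Soda Cloud or inspect your data source.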
Redeploy a self-hosted Soda Agent
The Soda Agent is a tool that empowers Soda Cloud users to securely access data sources to scan for data quality. Create a Kubernetes cluster in a cloud services provider environment, then use Helm to deploy a self-hosted Soda Agent in the cluster.
When you delete the Soda Agent Helm chart from your cluster, you also delete all the agent resources on your cluster. However, if you wish to redeploy the previously-registered agent under the same name, you need to specify the agent ID in the override values in your values YAML file.
- In Soda Cloud, navigate to your avatar > Agents.
- Click to select the agent you wish to redeploy, then copy the agent ID of the previously-registered agent from the URL.
For example, in the following URL, the agent ID is the long UUID at the end.
https://cloud.soda.io/agents/842feab3-snip-87eb-06d2813a72c1
Alternatively, if you use the base64 CLI tool, you can run the following command to obtain the agent ID.
kubectl get secret/soda-agent-id -n soda-agent --template={{.data.SODA_AGENT_ID}} | base64 --decode
- Open your values.yml file, then add the id key:value pair under agent, using the agent ID you copied from the URL as the value.
soda:
  apikey:
    id: "***"
    secret: "***"
  agent:
    id: "842feab3-snip-87eb-06d2813a72c1"
    name: "myuniqueagent"
- To redeploy the agent, you need to provide the values for the API keys the agent uses to connect to Soda Cloud in the values YAML file. Access the values by running the following command, replacing the soda-agent values with your own details, then paste the values into your values YAML file.
helm get values -n soda-agent soda-agent
Alternatively, if you use the base64 CLI tool, you can run the following commands to obtain the API key and API secret, respectively.
kubectl get secret/soda-agent-apikey -n soda-agent --template={{.data.SODA_API_KEY_ID}} | base64 --decode
kubectl get secret/soda-agent-apikey -n soda-agent --template={{.data.SODA_API_KEY_SECRET}} | base64 --decode
- In the same directory in which the values.yml file exists, use the following command to install the Soda Agent Helm chart.
helm install soda-agent soda-agent/soda-agent \
  --values values.yml \
  --namespace soda-agent
- Validate the Soda Agent deployment by running the following command:
kubectl describe pods
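The values file described in the steps above can also be generated rather than edited by hand. The following is a minimal sketch under stated assumptions: the function names `agent_id_from_url` and `render_redeploy_values` are hypothetical, the key layout mirrors the values.yml example on this page, and the values passed in must not contain double quotes, because the sketch does no YAML escaping.

```shell
#!/bin/sh
# Hypothetical helper: take the agent ID from the end of a Soda Cloud agent URL
# by removing everything up to and including the last "/".
agent_id_from_url() {
  printf '%s\n' "${1##*/}"
}

# Hypothetical helper: print a values file for redeploying a registered agent.
# Arguments: agent ID, agent name, API key ID, API key secret.
render_redeploy_values() {
  cat <<EOF
soda:
  apikey:
    id: "$3"
    secret: "$4"
  agent:
    id: "$1"
    name: "$2"
EOF
}

# Usage example with the placeholder values from this page. Real API key
# values come from `helm get values` or the kubectl commands above.
agent_id=$(agent_id_from_url "https://cloud.soda.io/agents/842feab3-snip-87eb-06d2813a72c1")
render_redeploy_values "$agent_id" "myuniqueagent" "***" "***"
```

Redirecting the output to values.yml produces a file that the helm install command above can read. Because the file holds the API key secret, keep it out of version control.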
Upgrade a self-hosted Soda Agent
The Soda Agent is a Helm chart that you deploy on a Kubernetes cluster and connect to your Soda Cloud account using API keys.
To take advantage of new or improved features and functionality in the Soda Agent, including new features in the Soda Library, you can upgrade your agent when a new version becomes available in ArtifactHub.io.
Note that there is no downtime associated with upgrading a self-hosted Soda Agent. Because Soda does not define the .spec.strategy in the deployment manifest of the Soda Agent Helm chart, Kubernetes uses the default RollingUpdate strategy to upgrade; refer to the Kubernetes documentation.
- If you regularly access multiple clusters, first ensure that you are accessing the cluster that contains your deployed Soda Agent. Use the following command to determine which cluster you are accessing.
kubectl config get-contexts
If you must switch contexts to access a different cluster, copy the name of the context for the cluster you wish to use, then run the following command.
kubectl config use-context <name of context>
- To upgrade the agent, you must know the values for:
- namespace - the namespace you created, and into which you deployed the Soda Agent
- release - the name of the instance of a helm chart that is running in your Kubernetes cluster
- API keys - the values Soda Cloud created which you used to run the agent application in the cluster
Access the first two values by running the following command.
helm list
Output:
NAME        NAMESPACE   REVISION  UPDATED                               STATUS    CHART              APP VERSION
soda-agent  soda-agent  5         2023-01-20 11:55:49.387634 -0800 PST  deployed  soda-agent-0.8.26  Soda_Library_1.0.0
- Access the API key values by running the following command, replacing the placeholder values with your own details.
helm get values -n <namespace> <release name>
From the output above, the command to use is:
helm get values -n soda-agent soda-agent
- Use the following command to search ArtifactHub for the most recent version of the Soda Agent Helm chart.
helm search hub soda-agent
- Use the following command to update your local Helm repository information.
helm repo update
- Upgrade the Soda Agent Helm chart. The value for the chart argument can be a chart reference such as example/agent, a path to a chart directory, a packaged chart, or a URL. To upgrade the agent, Soda uses a chart reference: soda-agent/soda-agent.
helm upgrade <release> <chart> \
  --set soda.apikey.id=*** \
  --set soda.apikey.secret=****
From the output above, the command to use is:
helm upgrade soda-agent soda-agent/soda-agent \
  --set soda.apikey.id=*** \
  --set soda.apikey.secret=****
OR, if you use a values YAML file,
helm upgrade soda-agent soda-agent/soda-agent \
  --values values-local.yml --namespace soda-agent
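The lookup of the namespace and the assembly of the upgrade command can be scripted. The following is a minimal sketch, assuming the `helm list` column order shown in the output above (release name first, namespace second) and release names without whitespace. The function names `release_namespace` and `print_upgrade_command` are hypothetical, and the sketch prints the command for review instead of running it.

```shell
#!/bin/sh
# Hypothetical helper: read `helm list` output on stdin and print the
# namespace of the release named in $1. The header row is skipped.
release_namespace() {
  awk -v release="$1" 'NR > 1 && $1 == release { print $2; exit }'
}

# Hypothetical helper: print (not run) the upgrade command.
# Arguments: release name, namespace, values file.
print_upgrade_command() {
  printf 'helm upgrade %s soda-agent/soda-agent --values %s --namespace %s\n' "$1" "$3" "$2"
}

# Usage example with the sample output from this page.
sample='NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
soda-agent soda-agent 5 2023-01-20 11:55:49.387634 -0800 PST deployed soda-agent-0.8.26 Soda_Library_1.0.0'
namespace=$(printf '%s\n' "$sample" | release_namespace soda-agent)
print_upgrade_command soda-agent "$namespace" values-local.yml
```

Against a live cluster, the sample text would be replaced by piping `helm list` into `release_namespace`.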
Upgrade to Soda Agent 1.0.0 or greater
Soda Agent 1.0.0 includes several key changes to the way the Soda Agent works. If you already use a Soda Agent, carefully consider the changes that Soda Agent 1.0.0 introduces and make appropriate changes to your configured parameters.
Soda Agent 1.0.0 favors managed or self-managed node groups over AWS Fargate, AKS Virtual Nodes, or GKE Autopilot profiles. Though this version of the agent still works with those profiles, the scan performance is slower because the profiles provision new nodes for each scan. To migrate your agent to a managed node group:
- Add a managed node group to your Kubernetes cluster.
- Check your cloud-services provider’s recommendations for node size and adapt them to your needs based on the volume of scans you anticipate. Best practice dictates that you set your cluster to have at least 2 CPU and 2GB of RAM, which, in general, is sufficient to run up to six scans in parallel.
- Upgrade to Soda Agent 1.0.0, configuring the Helm chart to not use Fargate, Virtual Nodes, or GKE Autopilot by:
- removing the provider.eks.fargate.enabled property, or setting the value to false
- removing the provider.aks.virtualNodes.enabled property, or setting the value to false
- removing the provider.gke.autopilot.enabled property, or setting the value to false
- removing the soda.agent.target property
- Remove the Fargate profiles, and drain existing workloads from virtual nodes in the namespace in which you deployed the Soda Agent so that the agent uses the node group to execute scans, not the profiles.
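If you prefer to set a profile property to false at upgrade time instead of editing the values file, the flag can be derived from the provider. The following is a minimal sketch: the function name `profile_disable_flag` and the short provider labels eks, aks, and gke are hypothetical, while the property names come from the list above. It does not cover the soda.agent.target property, which the steps above say to remove from your values rather than set.

```shell
#!/bin/sh
# Hypothetical helper: print the --set flag that switches off the serverless
# profile for one provider. Argument: eks, aks, or gke.
profile_disable_flag() {
  case "$1" in
    eks) property=provider.eks.fargate.enabled ;;
    aks) property=provider.aks.virtualNodes.enabled ;;
    gke) property=provider.gke.autopilot.enabled ;;
    *)
      echo "unknown provider: $1" >&2
      return 1 ;;
  esac
  printf '%s\n' "--set $property=false"
}

# Usage example: print the full upgrade command for review, not run it.
echo "helm upgrade soda-agent soda-agent/soda-agent --values values.yml --namespace soda-agent $(profile_disable_flag eks)"
```

Printing the command first makes it easier to confirm the release name, namespace, and values file before applying the change to a cluster.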
Upgrade Soda Library
To upgrade your existing Soda Library tool to the latest version, use the following command, replacing redshift with the install package that matches the type of data source you are using.
pip install -i https://pypi.cloud.soda.io soda-redshift -U
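If you use more than one data source, the same command pattern applies to each package. The following is a minimal sketch that only prints the commands. The function name `soda_upgrade_command` is hypothetical, and it assumes every install package follows the soda-<type> naming seen in the examples on this page.

```shell
#!/bin/sh
# Hypothetical helper: print the upgrade command for one data source type.
# Assumes the package is named soda-<type>, as in soda-redshift.
soda_upgrade_command() {
  printf 'pip install -i https://pypi.cloud.soda.io soda-%s -U\n' "$1"
}

# Usage example: print one command per data source type in use.
for source_type in redshift postgres; do
  soda_upgrade_command "$source_type"
done
```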
Uninstall Soda Library
- (Optional) From the command-line, run the following command to determine which Soda packages exist in your environment.
pip freeze | grep soda
- (Optional) Run the following command to uninstall a specific Soda package from your environment.
pip uninstall soda-postgres
- Run the following command to uninstall all Soda packages from your environment, completely.
pip freeze | grep soda | xargs pip uninstall -y
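To review exactly which packages the uninstall commands above would match, the package names can be separated from their version pins first. The following is a minimal sketch: the function name `soda_package_names` is hypothetical, and it assumes that all Soda packages have names starting with "soda", which is narrower than the `grep soda` filter used above.

```shell
#!/bin/sh
# Hypothetical helper: read `pip freeze` output on stdin and print only the
# names of packages that start with "soda", without version information.
soda_package_names() {
  # Keep lines that start with "soda", then cut everything from the first
  # version operator, "@" marker, or space onward.
  grep -i '^soda' | sed -E 's/[=<>~!@ ].*$//'
}

# Usage example with made-up sample lines in place of real pip output.
printf '%s\n' 'requests==2.0.0' 'soda-postgres==1.0.0' | soda_package_names

# Against a real environment, the input would come from pip itself:
#   pip freeze | soda_package_names
```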
Migrate from Soda Core
Soda Core, the free, open-source Python library and CLI tool upon which Soda Library is built, continues to exist as an OSS project in GitHub. To migrate from an existing Soda Core installation to Soda Library, simply uninstall the old and install the new from the command-line.
- Uninstall your existing Soda Core packages using the following command.
pip freeze | grep soda | xargs pip uninstall -y
- Install a Soda Library package that corresponds to your data source. Your new package automatically comes with a 45-day free trial. Our Soda team will contact you with licensing options after the trial period.
pip install -i https://pypi.cloud.soda.io soda-postgres
- If you had connected Soda Core to Soda Cloud, you do not need to change anything for Soda Library to work with your Soda Cloud account.
If you had not connected Soda Core to Soda Cloud, you need to connect Soda Library to Soda Cloud. Soda Library requires API keys to validate licensing or trial status and run scans for data quality. See Configure Soda for instructions.
- You do not need to adjust your existing configuration.yml or checks.yml files, which will continue to work as before.
Go further
- Learn more about the ways you can use Soda in Use case guides.
- Write custom SQL checks for your own use cases.
- Need help? Join the Soda community on Slack.
Documentation always applies to the latest version of Soda products