Re. the cmd shell error: The Jython instance that is bundled with Data360 Analyze is not intended or configured to provide shell access. The above error is not specific to the pip module; it would also be generated by other commands, such as jython --version.
Before commenting on your other questions, I need to state the following caveats:
- Only modules that are part of the Jython standard library or are shipped as part of Data360 Analyze are covered by the Infogix support policy.
- When an Analyze instance is upgraded, any packages installed into the Jython site-packages directory will be lost. For this reason it is recommended that any custom packages are installed in a directory outside of the Analyze installation directory. A directory used as a local repository in this way is not on the Python library path, so it must be added to the path in every node that uses locally installed packages.
- When a package is installed by pip, the normal operation is to also attempt to install any dependent packages. The primary package you want to install could be a pure-Python package, but it may depend on other packages that are written in C/C++, which can lead to an installation failure. Installing non-pure Python packages on Linux typically requires the relevant compiler to be installed in the OS.
- Not all packages have been written to the required standards and some packages may not operate correctly when they are installed in a custom directory rather than the site-packages directory. Indeed, some packages may still fail to operate even when installed into the default site-packages directory.
The version of Jython bundled with Analyze includes pip. This can be accessed from within the scripts of a Transform node. The attached file "Jython Transform Node Script to Install a Pure-Python Package in a Custom Repository.txt" provides an example of how the pure-Python PyPDF2 package can be installed by leveraging pip within a Transform node.
The pip module is not intended to be imported programmatically from within Jython. In the example ConfigureFields script, pip is therefore called using the subprocess module.
After importing the required libraries the script defines the target directory where the package is to be installed. The script then constructs the target parameter for the pip install command.
A function is defined to call pip within a subprocess.
After defining the name of the package to be installed, the function is called to perform the installation. The standard output from the installation command is captured so it can be output by the node as confirmation of what was installed.
The ProcessRecords script is then used to output the results.
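The steps described above can be sketched roughly as follows. This is not the attached script, only a minimal illustration of the same approach: the target directory, the function names and the way the interpreter is launched are all assumptions. In particular, sys.executable may not be set when Jython is embedded in another application, so the command used to start the interpreter may need to be supplied explicitly. The code that writes the captured text to the node's output is left out because it depends on the node's own scripting API.

```python
import subprocess
import sys

# Hypothetical directory outside the Analyze installation directory.
TARGET_DIR = "/opt/analyze-local-packages"


def build_pip_command(package, target_dir, interpreter_cmd=None):
    """Build the argument list for 'pip install --target=<dir> <package>'.

    interpreter_cmd is the (site-specific) command that starts the
    interpreter, given as a list. It defaults to sys.executable, which
    is an assumption and may not be set in an embedded Jython.
    """
    if interpreter_cmd is None:
        interpreter_cmd = [sys.executable]
    return list(interpreter_cmd) + [
        "-m", "pip", "install", "--target=" + target_dir, package
    ]


def run_command(command):
    """Run a command in a subprocess; return (exit code, combined output)."""
    proc = subprocess.Popen(
        command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT
    )
    output, _ = proc.communicate()
    if not isinstance(output, str):
        # Python 3 returns bytes; Jython 2.7 already returns str.
        output = output.decode("utf-8", "replace")
    return proc.returncode, output


# Example use (commented out because it needs network access):
# command = build_pip_command("PyPDF2", TARGET_DIR)
# return_code, install_log = run_command(command)
```

The captured install_log text is what a node could then emit as a record to confirm what was installed.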
As previously mentioned, any node that needs to leverage a package installed into the local repository must include the path in the Jython library search path. Below is an example of how the repository's path can be appended to the search path:
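As a rough sketch of that step (again, not the attached script), a node's script could extend the search path before importing the package. The directory name and the helper function are hypothetical, and the directory must match whatever was used as the install target.

```python
import sys

# Hypothetical local repository; must match the pip --target directory.
LOCAL_REPO = "/opt/analyze-local-packages"


def add_to_search_path(directory, path_list=None):
    """Append directory to the module search path if not already present."""
    paths = sys.path if path_list is None else path_list
    if directory not in paths:
        paths.append(directory)
    return paths


add_to_search_path(LOCAL_REPO)

# Once the path has been extended, the package can be imported, e.g.:
# import PyPDF2
```

Checking for the directory before appending keeps the path from growing if the script runs more than once in the same interpreter.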
Attached files
Jython Transform Node Script to Install a Pure-Python Package in a Custom Repository.txt
Jython Transform Node Script to Import Package from Custom Repository.txt