TC Oversight Controls

General Install Requirements

Anti-Virus Scanning

If anti-virus scanning is not configured properly, it can harm both TC Oversight Controls and your data. Scanning certain files and directories can damage data or cause it to be permanently lost. After TC Oversight Controls is installed in your environment, configure anti-virus scanning so that these files and directories are skipped during a scan. These exclusions are required to ensure that data needed by the current flow is neither modified nor deleted.

The following sections outline best practices for clients who have already installed TC Oversight Controls and have active anti-virus policies in place.

TCOC Responsibilities

The following repositories should be excluded from anti-virus scans. Repository locations are defined in the TCOC properties file and are configured when TCOC is installed.

Repository | Location

Content Repository | ./content_repository
• Holds the content of files currently being processed as well as any archived data.

FlowFile Repository | ./flowfile_repository
• Most crucial repository.
• Corruption of this repository results in data loss.

Provenance Repository | ./provenance_repository
• The space required depends on the size of your dataflow, the volume of data, and the number of events you want to be able to retain.

Database Repository | ./database_repository
• Relatively small repository.
• Flow configuration history and the user database are stored here.

Log File Directory | Logback.xml
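
As a rough illustration, the list of repository directories to exclude can be derived from the properties file rather than typed by hand. The Python sketch below assumes a Java-style key=value properties file. The key names in the sample (for example `tcoc.content.repository.directory`) and the install path `/opt/tcoc` are hypothetical; substitute the keys and paths that actually appear in your installation.

```python
from pathlib import PurePosixPath

# Hypothetical key names and values, for illustration only.
SAMPLE_PROPERTIES = """
# repository locations
tcoc.flowfile.repository.directory=./flowfile_repository
tcoc.content.repository.directory=./content_repository
tcoc.provenance.repository.directory=./provenance_repository
tcoc.database.repository.directory=./database_repository
"""


def parse_properties(text):
    """Parse key=value lines, skipping blank lines and comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def repository_dirs(props, install_dir):
    """Return absolute paths for every key that names a repository directory."""
    base = PurePosixPath(install_dir)
    dirs = []
    for key, value in sorted(props.items()):
        if "repository" in key and key.endswith(".directory") and value:
            # Relative locations such as ./content_repository are resolved
            # against the installation directory.
            dirs.append(str(base / value))
    return dirs


if __name__ == "__main__":
    for path in repository_dirs(parse_properties(SAMPLE_PROPERTIES), "/opt/tcoc"):
        print(path)
```

The resulting paths can then be entered as exclusions in whatever anti-virus product is in use; how exclusions are registered is product-specific and is not covered by this sketch.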

Config Files

Technically Creative recommends avoiding active anti-virus systems that monitor access to the underlying disk systems used for metadata storage. These processes store data structures only; nothing is stored that is executable by the underlying operating system. As these processes can be quite active, potentially performing continuous writes against large files, the best performance requires direct, unimpeded access to the underlying filesystem. Any anti-virus system that traps filesystem calls will have a negative impact on system performance.

The following config files should be excluded from anti-virus scans.

Apache Hadoop HDFS | Config File Setting (defined in core-site.xml and hdfs-site.xml)
Namenode | dfs.name.dir
Secondary Namenode | fs.checkpoint.dir
Datanode | dfs.datanode.dir
Tasktracker | mapred.local.dir
Jobtracker | mapred.local.dir
Logs | $HADOOP_LOG_DIR
/tmp/Hadoop | hadoop.tmp.dir

Apache HBase | Config File Setting (defined in hbase-default.xml)
HBase tmp directory | hbase.tmp.dir
HBase root directory | hbase.rootdir
HBase local directory | hbase.local.dir

YARN Resource Manager | Environment Variable (set in hadoop-env.sh, yarn-env.sh, and mapred-env.sh)
NameNode | HADOOP_NAMENODE_OPTS
DataNode | HADOOP_DATANODE_OPTS
Secondary NameNode | HADOOP_SECONDARYNAMENODE_OPTS
ResourceManager | YARN_RESOURCEMANAGER_OPTS
NodeManager | YARN_NODEMANAGER_OPTS
WebAppProxy | YARN_PROXYSERVER_OPTS
Map Reduce Job History Server | HADOOP_JOB_HISTORYSERVER_OPTS

Apache Kafka | Config File Setting (defined in server.properties)
Kafka log directory | log.dir

Apache ZooKeeper | Config File Setting (defined in zoo.cfg)
ZooKeeper data directory | dataDir=/var/lib/zookeeper
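
To turn these settings into concrete paths, the configured values have to be read from each node's config files. The Python sketch below shows one possible way to do that for Hadoop-style XML files, which store each setting as a `<property>` element with `<name>` and `<value>` children. The property names passed in come from the Hadoop table above; the sample XML and its directory paths are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up sample in the Hadoop *-site.xml layout, for illustration only.
SAMPLE_SITE_XML = """<configuration>
  <property><name>dfs.name.dir</name><value>/data/1/dfs/nn,/data/2/dfs/nn</value></property>
  <property><name>hadoop.tmp.dir</name><value>/tmp/hadoop</value></property>
  <property><name>dfs.replication</name><value>3</value></property>
</configuration>
"""

# Directory settings taken from the Hadoop table above.
WANTED = {"dfs.name.dir", "fs.checkpoint.dir", "mapred.local.dir", "hadoop.tmp.dir"}


def directories_from_site_xml(xml_text, wanted):
    """Map each wanted property found in the XML to its list of directories."""
    root = ET.fromstring(xml_text)
    found = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name in wanted and value:
            # A single setting may hold a comma-separated list of directories.
            found[name] = [d.strip() for d in value.split(",") if d.strip()]
    return found


if __name__ == "__main__":
    for name, dirs in directories_from_site_xml(SAMPLE_SITE_XML, WANTED).items():
        print(name, "->", ", ".join(dirs))
```

Settings that are absent from a site file fall back to defaults that this sketch does not resolve, so any property missing from the output should be checked by hand.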

Linux File Systems

For Linux file systems, Technically Creative recommends raising the ulimit to 10,240 for all users of TCOC, HDFS, and HBase.
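
A quick way to verify the setting is to check the limit from a process running as each service user. The text does not say which ulimit is meant; the Python sketch below assumes it is the maximum number of open file descriptors (the value reported by `ulimit -n`), which is an assumption rather than something stated in this guide. It uses the standard `resource` module, which is available on Linux and other Unix systems only.

```python
import resource

REQUIRED_LIMIT = 10240  # value recommended in the text above


def meets_recommendation(soft_limit, required=REQUIRED_LIMIT):
    """Return True when a soft limit is unlimited or at least `required`."""
    return soft_limit == resource.RLIM_INFINITY or soft_limit >= required


def check_open_files_limit():
    """Read the current process's open-file limits (assumed to be the intended ulimit)."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft, hard, meets_recommendation(soft)


if __name__ == "__main__":
    soft, hard, ok = check_open_files_limit()
    status = "OK" if ok else "below the recommended %d" % REQUIRED_LIMIT
    print("open files: soft=%s hard=%s (%s)" % (soft, hard, status))
```

Because limits are applied per user and per session, the check is only meaningful when it runs under the same account and launch method as the TCOC, HDFS, or HBase process.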
