Scalable File Service

Performance

2024-09-11 07:28:30

What factors influence the read/write speed of a file system?

The read/write speed is affected by both bandwidth and IOPS. For workloads dominated by large files, bandwidth has the greater effect; for workloads dominated by small files, IOPS has the greater effect. The maximum read/write performance of a single file system depends on the file system type and size. For details, see Product Specifications - SFS.

What are the performance indicators of the file system?

The file system has three performance indicators: IOPS, bandwidth, and latency.

•   IOPS (Input/Output Operations Per Second) denotes the number of IO operations (reads and writes) completed per second. For scenarios involving frequent operations on small files, IOPS is the primary indicator.

•   Bandwidth refers to the maximum amount of data that can be transferred per unit of time, which is the crucial indicator for scenarios involving access to large files.

•   Latency measures the time spent on a single read/write operation. Since a large IO may be split into multiple reads/writes, latency generally refers to the average latency of small IOs. This indicator is greatly affected by the network status and by how busy the file system is.
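To observe these indicators on a mounted file system, you can use a benchmarking tool such as fio. The following is a minimal sketch; the mount path /mnt/sfs, the file size, and the queue depths are assumptions to adjust to your environment.

# 4 KiB random writes: reports IOPS and average latency (small-file scenario)
fio --name=small-io --directory=/mnt/sfs --size=1G --bs=4k --rw=randwrite --ioengine=libaio --direct=1 --iodepth=32 --runtime=60 --time_based

# 1 MiB sequential reads: reports bandwidth (large-file scenario)
fio --name=large-io --directory=/mnt/sfs --size=1G --bs=1M --rw=read --ioengine=libaio --direct=1 --iodepth=16 --runtime=60 --time_based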

How can I enhance NAS access performance on a Linux operating system?

By default, the NFS client allows only two simultaneous NFS requests, which severely limits performance. Modifying sunrpc.tcp_slot_table_entries can boost the NAS access throughput of a single client. We recommend that you set this parameter to 128.

echo "options sunrpc tcp_slot_table_entries=128" >> /etc/modprobe.d/sunrpc.conf
echo "options sunrpc tcp_max_slot_table_entries=128" >>  /etc/modprobe.d/sunrpc.conf
sysctl -w sunrpc.tcp_slot_table_entries=128

Run the above commands before the first mount; the settings then take effect permanently.
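After the file system is mounted, you can verify that the new value is in effect (a quick check; the sunrpc parameters are only visible once the sunrpc module is loaded):

sysctl sunrpc.tcp_slot_table_entries
# Expected output: sunrpc.tcp_slot_table_entries = 128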

Note:

Increasing the number of concurrent NFS requests may increase the latency of individual IO operations. Adjust the value according to your service needs.

Why does a file system respond slowly or not at all when I run the ls command?

By default, ls traverses all files in the directory, obtains their metadata, and presents it to the user. If the directory is very large, for example containing 100,000 files, ls may need to issue 100,000 read requests, which takes a long time.

Solution:

•   Avoid putting too many files in a single directory. We recommend that you store fewer than 10,000 files in a single directory.

•   When running ls, use the full path /usr/bin/ls so that the common alias ls --color=auto is bypassed. Without --color=auto, ls does not need to read the metadata of every file in the directory, which greatly reduces the number of read requests.
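On many Linux distributions, ls is aliased to ls --color=auto, which stats every entry in order to colorize it. A minimal sketch for checking and bypassing the alias (the directory path is a placeholder):

alias ls                       # shows whether ls is aliased to "ls --color=auto"
/usr/bin/ls /mnt/sfs/bigdir    # the full path bypasses the alias, so entries are listed without per-file stat calls
\ls /mnt/sfs/bigdir            # a leading backslash also bypasses the alias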

Why does the number of files created per second not reach the nominal IOPS when I create files concurrently in a single directory?

Creating a file involves at least two IO operations: allocating disk space for the new file and adding the new file to its directory.

•   Allocating disk space for a new file can be performed concurrently, and the degree of concurrency is influenced by the size of the file system: a larger file system typically allows higher concurrency.

•   Adding new files to the same directory cannot be performed concurrently, because each creation modifies that directory, so the speed is limited mainly by IO latency. For example, if the file system latency is 1 ms, only about 1,000 such operations can be completed per second without concurrency, so the creation rate in a single directory does not exceed roughly 1,000 files per second.

Solution:

•   Avoid putting too many files in a single directory. We recommend that you store fewer than 10,000 files in a single directory, for example by spreading files across multiple subdirectories (see the sketch after this list).

•   Expand the file system capacity; this can improve the read/write performance of the file system.
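One way to avoid serializing all creations on a single directory is to spread new files across several subdirectories. A minimal sketch, assuming the file system is mounted at /mnt/sfs (the path and the number of subdirectories are placeholders):

# Create 16 subdirectories so that new files are not all added to one directory
for i in $(seq 0 15); do mkdir -p /mnt/sfs/data/shard-$i; done
# When creating file number N, place it in subdirectory N mod 16
N=42
touch /mnt/sfs/data/shard-$((N % 16))/file-$N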

How can I address delays in data visibility when writing to NFS file systems mounted on multiple cloud servers?

Issue Description: Cloud Server 1 updates File A, but when Cloud Server 2 reads the file immediately afterwards, it still gets the old content.

Cause: This issue can have two main causes. First, Cloud Server 1 does not flush File A to the server immediately after writing it; the data goes into the PageCache first and is flushed only when the application calls fsync or closes the file. Second, Cloud Server 2 has a file cache and may not immediately fetch the latest content from the server. For example, if Cloud Server 1 updates File A while Cloud Server 2 has already cached it, Cloud Server 2 may still read the old content from its cache.

Solution:

Option 1: After Cloud Server 1 updates the file, make sure the application calls fsync or closes the file. Before Cloud Server 2 reads the file, have it reopen the file so that the NFS client revalidates its cache.
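A minimal shell sketch of Option 1, assuming the file system is mounted at /mnt/sfs on both servers (the path and file name are placeholders):

# On Cloud Server 1 (writer): write the file, then flush it to the server
printf 'new content\n' > /mnt/sfs/fileA
sync /mnt/sfs/fileA    # GNU coreutils 8.24 or later can sync a single file; on older systems run plain "sync"

# On Cloud Server 2 (reader): cat reopens the file, so the NFS client revalidates its cache against the server
cat /mnt/sfs/fileA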

Option 2: Turn off the caches on both Cloud Server 1 and Cloud Server 2. This can lead to poor performance, so select the appropriate option based on your actual business situation.

Turn off the caches for Cloud Server 1: When mounting, add the noac parameter to ensure that all writes are flushed to the server immediately. Here is a mount command example:

mount -t nfs -o vers=3,proto=tcp,async,nolock,noatime,nodiratime,wsize=1048576,rsize=1048576,timeo=600,noac <mount address> <local mount path 1>

Turn off the caches for Cloud Server 2: When mounting, add the actimeo=0 parameter, which sets all attribute cache timeouts to 0 so that the client always revalidates with the server instead of reusing cached data. Here is a mount command example:

mount -t nfs -o vers=3,proto=tcp,async,nolock,noatime,nodiratime,wsize=1048576,rsize=1048576,timeo=600,actimeo=0 <mount address> <local mount path 2>

Select one of the above options based on your actual situation, and make sure that Cloud Server 2 can fetch the latest content immediately after Cloud Server 1 updates the file.
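To confirm which cache-related options are actually in effect on each server, you can inspect the active NFS mount options (a quick check; nfsstat is provided by the nfs-utils package):

nfsstat -m                  # lists each NFS mount with its effective options; look for noac or acregmin=0,acregmax=0
mount | grep ' type nfs'    # alternative if nfsstat is not installed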

