java - At what point does wrapping a FileOutputStream with a BufferedOutputStream make sense, in terms of performance?

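I have a module that is responsible for reading, processing, and writing bytes to disk. The bytes come in over UDP and, after the individual datagrams are assembled, the final byte array that gets processed and written to disk is typically between 200 bytes and 500,000 bytes. Occasionally there are byte arrays that, after assembly, exceed 500,000 bytes, but these are relatively rare.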

I'm currently using FileOutputStream's write method directly. I'm also experimenting with wrapping the FileOutputStream in a BufferedOutputStream, including using the constructor that accepts a buffer size as a parameter.

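It appears that using the BufferedOutputStream is trending toward slightly better performance, but I've only just begun to experiment with different buffer sizes. I only have a limited set of sample data to work with (two data sets from sample runs that I can pipe through my application). Given what I know about the data I'm writing, is there a general rule of thumb I could apply to calculate the optimal buffer size, so as to reduce disk writes and maximize disk-write performance?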

2 Answers
32 votes

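BufferedOutputStream helps when the writes are smaller than the buffer size, e.g. 8 KB. For larger writes it doesn't help, nor does it make things much worse. If ALL your writes are larger than the buffer size, or you always flush() after every write, I would not use a buffer. However, if a good portion of your writes are smaller than the buffer size and you don't flush() every time, it's worth having.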
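To make the effect of small writes concrete, here is a minimal sketch that counts how many write calls actually reach the underlying stream. The `CountingSink` class and the `bufferedCalls` helper are hypothetical names invented for this illustration; the sink stands in for a FileOutputStream so that no disk is involved.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;

class WriteCountDemo {

    /** Hypothetical helper: discards all data but counts the write calls it receives. */
    static final class CountingSink extends OutputStream {
        int calls;

        @Override
        public void write(int b) {
            calls++;
        }

        @Override
        public void write(byte[] b, int off, int len) {
            calls++;
        }
    }

    /**
     * Writes `chunks` arrays of `chunkSize` bytes through an 8 KB buffer
     * and returns how many write calls reached the sink.
     */
    static int bufferedCalls(int chunkSize, int chunks) throws IOException {
        CountingSink sink = new CountingSink();
        try (OutputStream out = new BufferedOutputStream(sink, 8 * 1024)) {
            byte[] chunk = new byte[chunkSize];
            for (int i = 0; i < chunks; i++) {
                out.write(chunk);
            }
        } // close() flushes whatever is still sitting in the buffer
        return sink.calls;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("1000 x 200 bytes:   " + bufferedCalls(200, 1000) + " underlying writes");
        System.out.println("10 x 500,000 bytes: " + bufferedCalls(500_000, 10) + " underlying writes");
    }
}
```

With an 8 KB buffer, the 1000 small writes collapse into 25 underlying calls, while the 10 large writes still produce 10 calls, which is the same number you would get without the buffer.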

You might find that increasing the buffer size to 32 KB or larger gives you a marginal improvement, or makes things worse. Your mileage may vary.
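Since the effect of a larger buffer depends on the workload, the simplest approach is to measure it. The following is a rough sketch of such a sweep, not a rigorous benchmark: the class name `BufferSizeSweep`, the helper `timeWrite`, the chosen buffer sizes, and the 200-byte chunk size are all assumptions picked for illustration, and the timings will vary with the disk, the OS cache, and JIT warm-up.

```java
import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

class BufferSizeSweep {

    /**
     * Writes `chunks` arrays of `chunkSize` bytes to `file` through a buffer
     * of `bufferSize` bytes and returns the elapsed time in nanoseconds.
     */
    static long timeWrite(Path file, int bufferSize, int chunkSize, int chunks) throws IOException {
        byte[] chunk = new byte[chunkSize];
        long begin = System.nanoTime();
        try (OutputStream out = new BufferedOutputStream(new FileOutputStream(file.toFile()), bufferSize)) {
            for (int i = 0; i < chunks; i++) {
                out.write(chunk);
            }
        } // closing the stream flushes the last partial buffer
        return System.nanoTime() - begin;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("buffer-sweep", ".bin");
        try {
            int[] bufferSizes = {1024, 8 * 1024, 32 * 1024, 128 * 1024};
            // Several rounds, because the first round includes class loading and JIT warm-up.
            for (int round = 0; round < 3; round++) {
                for (int size : bufferSizes) {
                    long nanos = timeWrite(file, size, 200, 50_000);
                    System.out.printf("round=%d buffer=%d nanos=%d%n", round, size, nanos);
                }
            }
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

To make the numbers meaningful for your case, replace the fixed 200-byte chunks with a replay of your real sample data.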


You might find the code for BufferedOutputStream.write useful:

/**
 * Writes <code>len</code> bytes from the specified byte array
 * starting at offset <code>off</code> to this buffered output stream.
 *
 * <p> Ordinarily this method stores bytes from the given array into this
 * stream's buffer, flushing the buffer to the underlying output stream as
 * needed.  If the requested length is at least as large as this stream's
 * buffer, however, then this method will flush the buffer and write the
 * bytes directly to the underlying output stream.  Thus redundant
 * <code>BufferedOutputStream</code>s will not copy data unnecessarily.
 *
 * @param      b     the data.
 * @param      off   the start offset in the data.
 * @param      len   the number of bytes to write.
 * @exception  IOException  if an I/O error occurs.
 */
public synchronized void write(byte b[], int off, int len) throws IOException {
    if (len >= buf.length) {
        /* If the request length exceeds the size of the output buffer,
           flush the output buffer and then write the data directly.
           In this way buffered streams will cascade harmlessly. */
        flushBuffer();
        out.write(b, off, len);
        return;
    }
    if (len > buf.length - count) {
        flushBuffer();
    }
    System.arraycopy(b, off, buf, count, len);
    count += len;
}
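
The two branches of that method can be observed from the outside. This is a small sketch using a deliberately tiny 16-byte buffer; `RecordingSink` and `run` are hypothetical names made up for the demonstration, and the sink simply records the length of each array write it receives.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

class BypassDemo {

    /** Hypothetical helper: records the length of every write that reaches it. */
    static final class RecordingSink extends OutputStream {
        final List<Integer> lengths = new ArrayList<>();

        @Override
        public void write(int b) {
            lengths.add(1);
        }

        @Override
        public void write(byte[] b, int off, int len) {
            lengths.add(len);
        }
    }

    /** Performs three writes through a 16-byte buffer and returns what the sink saw. */
    static List<Integer> run() throws IOException {
        RecordingSink sink = new RecordingSink();
        try (OutputStream out = new BufferedOutputStream(sink, 16)) {
            out.write(new byte[10]); // fits: copied into the buffer, nothing reaches the sink yet
            out.write(new byte[10]); // only 6 bytes free: the buffered 10 are flushed, then these 10 are buffered
            out.write(new byte[16]); // len >= buffer size: the buffered 10 are flushed, then 16 bytes bypass the buffer
        } // the buffer is empty here, so close() writes nothing more
        return sink.lengths;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(run()); // prints [10, 10, 16]
    }
}
```

The last write never touches the internal buffer, which is why large writes cost essentially the same with or without the wrapper.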
Peter Lawrey answered 2020-07-28T07:37:18Z
1 vote

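I have been trying to explore IO performance lately. From what I have observed, writing directly to a FileOutputStream gives better results, which I attribute to FileOutputStream's native call for write(byte[], int, int). I have also observed that when the latency of BufferedOutputStream begins to converge toward that of a direct FileOutputStream, it fluctuates a lot more, i.e. it can abruptly even double (I have not yet been able to find out why).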

P.S. I am using Java 8 and cannot comment right now on whether my observations hold for earlier Java versions.

Here is the code I tested; my input was a ~10 KB file:

// Imports were not shown in the original post; Log4j 2 is assumed for the logger.
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

public class WriteCombinationsOutputStreamComparison {
    private static final Logger LOG = LogManager.getLogger(WriteCombinationsOutputStreamComparison.class);

    public static void main(String[] args) throws IOException {

        final BufferedInputStream input = new BufferedInputStream(new FileInputStream("src/main/resources/inputStream1.txt"), 4*1024);
        final ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
        int data = input.read();
        while (data != -1) {
            byteArrayOutputStream.write(data); // everything comes in memory
            data = input.read();
        }
        final byte[] bytesRead = byteArrayOutputStream.toByteArray();
        input.close();

        /*
         * 1. WRITE USING A STREAM DIRECTLY with entire byte array --> FileOutputStream directly uses a native call and writes
         */
        try (OutputStream outputStream = new FileOutputStream("src/main/resources/outputStream1.txt")) {
            final long begin = System.nanoTime();
            outputStream.write(bytesRead);
            outputStream.flush();
            final long end = System.nanoTime();
            LOG.info("Total time taken for file write, writing entire array [nanos=" + (end - begin) + "], [bytesWritten=" + bytesRead.length + "]");
            if (LOG.isDebugEnabled()) {
                LOG.debug("File reading result was: \n" + new String(bytesRead, StandardCharsets.UTF_8));
            }
        }

        /*
         * 2. WRITE USING A BUFFERED STREAM, write entire array
         */

        // changed the buffer size to different combinations --> write latency fluctuates a lot for same buffer size over multiple runs
        try (BufferedOutputStream outputStream = new BufferedOutputStream(new FileOutputStream("src/main/resources/outputStream1.txt"), 16*1024)) {
            final long begin = System.nanoTime();
            outputStream.write(bytesRead);
            outputStream.flush();
            final long end = System.nanoTime();
            LOG.info("Total time taken for buffered file write, writing entire array [nanos=" + (end - begin) + "], [bytesWritten=" + bytesRead.length + "]");
            if (LOG.isDebugEnabled()) {
                LOG.debug("File reading result was: \n" + new String(bytesRead, StandardCharsets.UTF_8));
            }
        }
    }
}

Output:

2017-01-30 23:38:59.064 [INFO] [main] [WriteCombinationsOutputStream] - Total time taken for file write, writing entire array [nanos=100990], [bytesWritten=11059]

2017-01-30 23:38:59.086 [INFO] [main] [WriteCombinationsOutputStream] - Total time taken for buffered file write, writing entire array [nanos=142454], [bytesWritten=11059]
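
A single timed run per variant is sensitive to class loading, JIT warm-up, and the order in which the variants execute, so a comparison like this is easier to trust when each variant is repeated. Below is a hedged sketch of that idea, not the code used for the numbers above: the class `RepeatedWriteTiming`, its helpers, and the warm-up and run counts are assumptions chosen for illustration. Note also that an 11,059-byte payload is smaller than the 16 KB buffer, so the buffered variant first copies the data into the buffer and only writes it out on flush().

```java
import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

class RepeatedWriteTiming {

    /** Times one write-and-flush of `data`; a bufferSize of 0 or less means no BufferedOutputStream. */
    static long timeOnce(Path file, byte[] data, int bufferSize) throws IOException {
        OutputStream raw = new FileOutputStream(file.toFile());
        try (OutputStream out = bufferSize > 0 ? new BufferedOutputStream(raw, bufferSize) : raw) {
            long begin = System.nanoTime();
            out.write(data);
            out.flush();
            return System.nanoTime() - begin;
        }
    }

    /** Returns the middle element of the sorted samples (the upper middle for even counts). */
    static long median(long[] samples) {
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        return sorted[sorted.length / 2];
    }

    /** Runs `warmups` untimed iterations, then returns the median of `runs` timed iterations. */
    static long medianNanos(Path file, byte[] data, int bufferSize, int warmups, int runs) throws IOException {
        for (int i = 0; i < warmups; i++) {
            timeOnce(file, data, bufferSize);
        }
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            samples[i] = timeOnce(file, data, bufferSize);
        }
        return median(samples);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("write-timing", ".bin");
        try {
            byte[] data = new byte[11_059]; // same payload size as in the log output above
            System.out.println("direct   median nanos: " + medianNanos(file, data, 0, 200, 1001));
            System.out.println("buffered median nanos: " + medianNanos(file, data, 16 * 1024, 200, 1001));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

For anything beyond a quick sanity check, a dedicated harness such as JMH is the more reliable choice.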
Dev Amitabh answered 2020-07-28T07:37:52Z
Translated from https://stackoverflow.com:/questions/8712957/at-what-point-does-wrapping-a-fileoutputstream-with-a-bufferedoutputstream-make