Fine-tuning the buffer size is a well-known technique for improving latency and throughput in a network. However, it is difficult to achieve in practice: the microscopic traffic pattern changes dynamically and is affected by many factors in the network, and the fine-grained information needed to predict the optimal buffer size for the upcoming moment is difficult to obtain. To address this problem, this paper proposes a new approach, Buffer Optimization using Reinforcement Learning (BO-RL), which dynamically adjusts the buffer size of routers based on observations of the network environment, including routers and end devices. A proof-of-concept implementation was developed using NS-3, OpenAI Gym, and TensorFlow to integrate the Reinforcement Learning (RL) agent with the router so that its buffer size can be adjusted dynamically. This paper reports the design of BO-RL and the results of preliminary experiments on a network topology with a limited number of nodes. Significant improvements in end-to-end delay and average throughput are observed when BO-RL is applied to a router. © 2021, The Author(s), under exclusive license to Springer Nature Switzerland AG.
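The abstract does not specify the agent's design. As a rough illustration of the kind of control loop it describes, the following is a toy tabular Q-learning sketch in which an agent grows or shrinks a router's buffer to trade off throughput against drops and queueing delay. Everything here is an assumption for illustration: the `BufferEnv` queue model, the action set, and the reward shape are hypothetical stand-ins, not the paper's method; the actual system integrates an RL agent with NS-3 via OpenAI Gym and TensorFlow.

```python
import random

class BufferEnv:
    """Hypothetical toy queue model (NOT the paper's NS-3 environment).
    Packets arrive in bursts; the agent may grow or shrink the buffer.
    Reward rewards admitted packets and penalises drops and backlog."""
    ACTIONS = (-10, 0, 10)  # shrink / keep / grow buffer limit (packets)

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        self.buf_size = 50   # current buffer limit (packets)
        self.backlog = 0     # packets currently queued
        return self._state()

    def _state(self):
        # Coarse occupancy bins so a tabular Q-table stays small.
        occ = min(9, 10 * self.backlog // max(self.buf_size, 1))
        return (self.buf_size // 10, occ)

    def step(self, action_idx):
        self.buf_size = max(10, min(200, self.buf_size + self.ACTIONS[action_idx]))
        arrivals = self.rng.choice((5, 5, 5, 40))  # occasional burst
        admitted = max(0, min(arrivals, self.buf_size - self.backlog))
        dropped = arrivals - admitted
        self.backlog = max(0, self.backlog + admitted - 20)  # service rate: 20
        # Throughput minus drop and delay penalties (illustrative weights).
        reward = admitted - 2 * dropped - 0.1 * self.backlog
        return self._state(), reward

def train(episodes=200, steps=50, alpha=0.3, gamma=0.9, eps=0.1, seed=0):
    """Plain epsilon-greedy Q-learning over the toy environment."""
    rng = random.Random(seed)
    q = {}
    env = BufferEnv(seed)
    for _ in range(episodes):
        s = env.reset()
        for _ in range(steps):
            if rng.random() < eps:
                a = rng.randrange(3)
            else:
                a = max(range(3), key=lambda x: q.get((s, x), 0.0))
            s2, r = env.step(a)
            best_next = max(q.get((s2, x), 0.0) for x in range(3))
            old = q.get((s, a), 0.0)
            q[(s, a)] = old + alpha * (r + gamma * best_next - old)
            s = s2
    return q

if __name__ == "__main__":
    q = train()
    print(f"learned Q-table with {len(q)} state-action entries")
```

A real implementation would replace `BufferEnv` with the NS-3 simulation exposed through a Gym interface and the Q-table with a TensorFlow function approximator; the step/reward loop structure, however, is the same.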